[Task] <Support discovery monitoring instance through http_sd>
Description
- http request http_sd api to get service instance ip, port and more.
- auto create monitor to monitoring service instance throght prometheus protocol
Refer: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config https://prometheus.io/docs/prometheus/latest/http_sd/
Task List
No response
@tomsun28 Please assign this issue to me.
Hi, I feel that we need to carefully consider the following designs
- What method is used to call the http_sd api to obtain service instance? Is it a scheduled request or a one-time request? When to update service instance data.
- How to automatically create monitoring.
- more
Hi, I feel that we need to carefully consider the following designs
- What method is used to call the http_sd api to obtain service instance? Is it a scheduled request or a one-time request? When to update service instance data.
- How to automatically create monitoring.
- more
Thanks for your advice. I will check the code and consider how to design it better.
@tomsun28 Hi, I found that Prometheus uses http_sd or file_sd to discover service instances automatically in order to manage them better. Here is the illustration:
Do you mean that Hertzbeat needs to support this feature?
Like when somebody wants to creat a RocketMQ monitor, he/she could enter a http_sd endpoint to manage RocketMQ cluster instead of creating monitor for each RocketMQ instance.
Here is the illustration:
Hi, it’s like this👍👍. We discover and create monitoring instances automatically throght the http_sd endpoint response content.
I thinks there is a difficulty that to design how and when to create monitoring instances and how to dispaly them.
Hi, @tomsun28 I've spended some times for sorting process of collecting data and found that there are some existed design problems.
- All protocols in path
org.dromara.hertzbeat.common.entity.job.protocolhave no super class or interface, which is hard to define and modify base field or config in runtime. - Every single concrete collect class in path
org.dromara.hertzbeat.collector.collecthas a lot of duplicate code due toorg.dromara.hertzbeat.collector.collect.AbstractCollecthas only one method to normative behavior of subclass. Such aspreCheckParam,Remote connecting,Exception handlingand so on.
Thus, here are my thoughts:
- Create base
Protocolinterface - Redesign AbstractCollector
I write a demo as well: AbstractCollector:
public abstract class AbstractCollector<P extends CommonProtocol, D> implements HertzBeatCollector<P, D> {
private P protocol;
public void collect(CollectRep.MetricsData.Builder builder, long monitorId, String app, Metrics metrics) {
protocol = constructNewProtocol();
CollectClient<D> collectClient = null;
try {
// 检查参数
if (!isValidProtocol(protocol)) {
return;
}
//todo 对于通过http_sd定义的集群监控,可以在这里循环进行以下步骤,最后统一聚合到CollectRep.MetricsData.Builder即可
// 创建客户端
collectClient = createCollectClientTemplate(protocol);
// 连接远程客户端
collectClient.connect();
// 采集数据
processCollectedData(builder, collectClient.fetchData());
}catch (Exception exception) {
// 异常处理,随时可以实现并扩展
DefaultExceptionHandler.getInstance().onHandle(exception, builder, null);
}finally {
closeGracefully(collectClient, null);
}
}
public abstract CollectClient<D> createCollectClientTemplate(P protocol) throws Exception;
/**
* the protocol this collect instance support
* @return protocol str
*/
public abstract String supportProtocol();
private P constructNewProtocol() {
// clone a new protocol
// redefine base config here, like http_sd
// 这里直接面向CommonProtocol编码即可
return null;
}
private void closeGracefully(CollectClient<D> collectClient, Logger logger) {
if (Objects.isNull(collectClient)) {
return;
}
try {
collectClient.close();
} catch (Exception exception) {
if (Objects.nonNull(logger) && logger.isWarnEnabled()) {
logger.warn("Failed to close CollectClient: {}", CommonUtil.getMessageFromThrowable(exception));
}
}
}
}
SmtpDemo:
// Define the class of protocol and response data when you extend AbstractCollector
public class SmtpDemo extends AbstractCollector<SmtpProtocol, String> {
@Override
public boolean isValidProtocol(SmtpProtocol protocol) {
// check SmtpProtocol here...
return true;
}
@Override
public CollectClient<String> createCollectClientTemplate(SmtpProtocol protocol) {
// create client and define each action here...
String host = protocol.getHost();
String port = protocol.getPort();
int timeout = CollectUtil.getTimeout(protocol.getTimeout());
SMTP smtp = new SMTP();
smtp.setConnectTimeout(timeout);
smtp.setCharset(StandardCharsets.UTF_8);
return new CollectClient<>() {
@Override
public void connect() throws Exception {
// connect remote client...
smtp.connect(host, Integer.parseInt(port));
}
@Override
public String fetchData() throws Exception {
// fetch data from remote...
smtp.helo(protocol.getEmail());
return smtp.getReplyString();
}
@Override
public void close() throws Exception {
// close client...
smtp.disconnect();
}
};
}
@Override
public void processCollectedData(CollectRep.MetricsData.Builder builder, String collectedData) {
// transfer collected data and build HertzBeat data here...
}
@Override
public String supportProtocol() {
return null;
}
}
What are the benefits if we refactor Collector as mentioned about?
- Friendly for expanding collector, we can directly extend
AbstractCollectorand code step by step. There is not necessary to worry about any other things likeCatching Exceptionand to look for useful parameter inorg.dromara.hertzbeat.common.entity.job.Metrics, all we need to do is focus on how to monitor remote server. - Handling Exception in one place and good for extension.
- Easy for Dynamic protocol processing and aggregating data without worrying upper code change.
Before:
After:
- We can enhance collecting data easily because each action is completely divided.
It will be a dramatical refactor and needs to modify each Collector. I have no idea if it is a better design. Do you have any better suggesion?
Hi, better design, 👍👍👍 I think this is a great abstraction of collect capabilities and protocol. We can do this. But there are still some points that we need to consider. In some collect, client connection need to be reused without being closed and something else.
Hi, better design, 👍👍👍 I think this is a great abstraction of collect capabilities and protocol. We can do this. But there are still some points that we need to consider. In some collect, client connection need to be reused without being closed and something else.
Maybe creating a wrapper for CollectClient can help reuse connection. I will consider it as well.
#1791
Hi, @tomsun28 I've written a demo for what you mentioned before about http_sd in pr #1791
- Enter http_sd url(optional)
- Detect http_sd through one-time job
- Batch create monitor by http_sd
Is it the desired result?
Yeah, 👍👍 Geat! Can you sumbit a pr for this, now we are publishing new version, we can merge this after release.
Yeah, 👍👍 Geat! Can you sumbit a pr for this, now we are publishing new version, we can merge this after release.
I will create a pr tonight.