
filetail: log file is moved to a history directory; how to configure file generation and archive strategies?

Open dongbin86 opened this issue 8 years ago • 6 comments

I have an nginx log file to collect at "/data/logs/nginx/xyz.log". Every day the log is moved and compressed to /data/logs/nginx/2017/05/xyz.log.tar.gz, and a new log file /data/logs/nginx/xyz.log is created. How should I configure file generation and archive strategies? I tried the default "Active File with Reverse Counter" naming type, but I found that StreamSets tries to collect /data/logs/nginx/2017, which is a directory.

dongbin86 avatar May 04 '17 05:05 dongbin86

I'm not clear on what you want SDC to do. Do you want it to read the xyz.log.tar.gz file each day?

metadaddy avatar May 04 '17 21:05 metadaddy

No, I only need to collect the current xyz.log. But if I move the log into a subdirectory of the same directory, then when MultiFileReader refreshes the offset and recomputes the header hash, a java.io.FileNotFoundException occurs, because that subdirectory is not excluded.
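
To illustrate the failure mode described here, the following is a minimal Java sketch, not SDC code. It assumes the header hash is computed by opening each directory entry for reading; the class and method names (`DirectoryReadDemo`, `openFails`) are hypothetical. It shows that opening a subdirectory such as `2017` as if it were a regular file raises `java.io.FileNotFoundException`, while opening `xyz.log` does not.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; it only mimics "open an entry to read its header".
class DirectoryReadDemo {

    // Returns true if opening the path for reading fails with FileNotFoundException.
    static boolean openFails(Path p) throws IOException {
        try (FileInputStream in = new FileInputStream(p.toFile())) {
            return false; // opened successfully, so it is a readable regular file
        } catch (FileNotFoundException e) {
            return true;  // missing file, or a directory rather than a regular file
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary stand-in for /data/logs/nginx with an archive subdirectory.
        Path logDir = Files.createTempDirectory("nginx-logs");
        Path archiveDir = Files.createDirectory(logDir.resolve("2017"));
        Path logFile = Files.createFile(logDir.resolve("xyz.log"));

        System.out.println("xyz.log fails to open: " + openFails(logFile));
        System.out.println("2017/ fails to open:   " + openFails(archiveDir));
    }
}
```

This is why any scan of the log directory that hashes every entry needs to skip directories first.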

dongbin86 avatar May 06 '17 06:05 dongbin86

I think refresh() in LiveFile.java should be changed to this:

    if (changed) {
      try (DirectoryStream<Path> directoryStream = Files.newDirectoryStream(path.getParent())) {
        for (Path path : directoryStream) {
          // Skip subdirectories (for example the archive directory "2017").
          if (!path.toFile().isDirectory()) {
            BasicFileAttributes attrs = Files.readAttributes(path, BasicFileAttributes.class);
            String iNode = attrs.fileKey().toString();
            int headLen = (int) Math.min(this.headLen, attrs.size());
            String headHash = computeHash(path, headLen);
            // Same inode and same header hash: this is the renamed live file.
            if (iNode.equals(this.iNode) && headHash.equals(this.headHash)) {
              if (headLen == 0) {
                headLen = (int) Math.min(HEAD_LEN, attrs.size());
                /* get file header content and compute md5 as hash value */
                headHash = computeHash(path, headLen);
              }
              refresh = new LiveFile(path, iNode, headHash, headLen);
              break;
            }
          }
        }
      }
    }

dongbin86 avatar May 06 '17 08:05 dongbin86

If filetail is used to collect logs, no subdirectory may exist in the log directory; otherwise the LiveFile refresh throws an exception. But in real production environments, logs are always compressed and moved to subdirectories. So I think that if the file has been renamed within the same directory, refresh should return the renamed file, but if the file has been deleted or moved away, refresh should return null.
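
The behavior proposed above can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the actual LiveFile implementation: it matches files only by `BasicFileAttributes.fileKey()` (the snippet posted earlier also compares a header hash), it assumes a file system where `fileKey()` is non-null and survives a rename (typical on Linux), and the names `RotatedFileLocator`, `fileKeyOf`, and `findByFileKey` are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper; only illustrates "renamed in place vs. moved away".
class RotatedFileLocator {

    // Reads the file key (inode-like identity); may be null on some platforms.
    static Object fileKeyOf(Path p) throws IOException {
        return Files.readAttributes(p, BasicFileAttributes.class).fileKey();
    }

    // Returns the regular file directly inside dir whose file key equals wantedKey,
    // or null if the file was deleted or moved out of dir (e.g. into a subdirectory).
    static Path findByFileKey(Path dir, Object wantedKey) throws IOException {
        if (wantedKey == null) {
            return null; // platform does not expose file keys
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                BasicFileAttributes attrs;
                try {
                    attrs = Files.readAttributes(entry, BasicFileAttributes.class);
                } catch (NoSuchFileException e) {
                    continue; // entry vanished between listing and stat
                }
                if (!attrs.isRegularFile()) {
                    continue; // skip subdirectories such as "2017"
                }
                if (wantedKey.equals(attrs.fileKey())) {
                    return entry;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("nginx-logs");
        Path archive = Files.createDirectory(dir.resolve("2017"));
        Path log = Files.createFile(dir.resolve("xyz.log"));
        Object key = fileKeyOf(log);

        // Renamed within the same directory: still found.
        Path renamed = Files.move(log, dir.resolve("xyz.log.1"));
        System.out.println("after rename: " + findByFileKey(dir, key));

        // Moved into the archive subdirectory: no longer found.
        Files.move(renamed, archive.resolve("xyz.log.1"));
        System.out.println("after archive: " + findByFileKey(dir, key));
    }
}
```

Skipping non-regular entries before touching their contents is what avoids the exception, and returning null when nothing matches lets the caller treat the file as gone.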

dongbin86 avatar May 09 '17 02:05 dongbin86

@sumpan Have you tried the above fix? Is it working for you?

metadaddy avatar May 10 '17 00:05 metadaddy

Yes, I changed some code and it works fine. Here's the pull request: https://github.com/streamsets/datacollector/pull/27

dongbin86 avatar May 10 '17 04:05 dongbin86