Include *.gz (for example) doesn't propagate through subdirs

Andy Haveland
Guest

I collect a large number of weblogs on a Linux server. They are automatically split and archived in monthly batches, and the archives sit in archive subdirectories beneath the log directories.

I want to back up these directories to a Windows machine using WinSCP.

Criteria: new and updated files only, *.gz only, preserve directories and dates, and ignore the current access_logs.

Ideally, I would like to use a command line option and/or schedule the job.

However, copying the log directory that contains all these customer directories and logs just skips everything, because no *.gz files are found at that level, and WinSCP does not seem to traverse deeper.

Is this known?

Cheers,
Andy.

martin
Site Admin

Re: Include *.gz (for example) doesn't propagate through subdirs

You haven't read the documentation :-)
The inclusion/exclusion mask is applied to directories as well. As your directories probably do not have .gz extensions, they are all excluded. You need to add the mask "*/" to match them all (this is supported since 3.7.6 only); see the last sentence on the link above.
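
To illustrate the point, here is a minimal Python sketch of a transfer that applies the include mask to directories as well as files. It is a toy model built on Python's `fnmatch`, not WinSCP's actual implementation; the directory layout and the archive file names are hypothetical.

```python
from fnmatch import fnmatchcase

def included(name, is_dir, masks):
    """Toy model: return True if `name` passes any mask in the include list."""
    for mask in masks:
        if mask.endswith("/"):
            # A trailing slash marks a directory-only mask.
            if is_dir and fnmatchcase(name, mask[:-1]):
                return True
        elif fnmatchcase(name, mask):
            # Plain masks are tested against files AND directories.
            return True
    return False

def transfer(tree, masks, prefix=""):
    """Walk a nested dict (dict = directory, None = file); return transferred file paths."""
    out = []
    for name, child in sorted(tree.items()):
        is_dir = isinstance(child, dict)
        if not included(name, is_dir, masks):
            continue  # an excluded directory is never entered
        path = prefix + name
        if is_dir:
            out.extend(transfer(child, masks, path + "/"))
        else:
            out.append(path)
    return out

# Hypothetical layout resembling the one described in the question.
logs = {
    "customer1": {"access_log": None, "archive": {"2004-01.gz": None}},
    "customer2": {"access_log": None, "archive": {"2004-02.gz": None}},
}

only_gz = transfer(logs, ["*.gz"])
gz_and_dirs = transfer(logs, ["*.gz", "*/"])
print(only_gz)
print(gz_and_dirs)
```

In this model, "*.gz" alone transfers nothing because "customer1" and "customer2" fail the mask and are never entered. Adding "*/" lets every directory through while the current access_log files are still skipped.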

Anton P.

Hi Martin,

The file masks documentation says, "Mask */ matches any directory. For example to transfer only HTML files located in any directory, use include mask */; *.html"

This include mask matches any subdirectory (at any depth), and matches all HTML files found at any depth. If we were to copy remote files to the local machine, this mask would result in subdirectories being created on the local machine which are empty, because they didn't contain any HTML files on the remote machine. Is there any way to avoid this?


Another question regarding masks: is there a wildcard which works a bit like * but covers arbitrary depth of subdirectories? So in the example above, if $ was such a wildcard, I would include all HTML files at any depth by specifying the include mask "$/*.html". (This would also avoid the issue discussed above.)

This would be useful, because, for example, at the moment I have a transfer preset which uses the remote directory mask "*/public_html; */public_html/*; */public_html/*/*" which applies to a remote directory (at any depth) called public_html and also to exactly one or two levels of subdirectory below such a directory. I would like to be able to say "*/public_html/$" so that it matches any subdirectory of public_html at any depth. Without this, I am forced to add a new mask of the form "*/public_html/*/*/...../*/*" every time I want it to apply to a deeper subdirectory.

(Note that the way of specifying a relative path---prefixing the path by */ as in "*/public_html"---is anomalous because * is not allowed to represent more than one piece (filepart or folderpart) of a path in any other situation than this. With a $ wildcard, the syntax could instead be "$/public_html".)

Thanks!
Anton

martin
Site Admin

Anton P. wrote:

This include mask matches any subdirectory (at any depth), and matches all HTML files found at any depth. If we were to copy remote files to the local machine, this mask would result in subdirectories being created on the local machine which are empty, because they didn't contain any HTML files on the remote machine. Is there any way to avoid this?
No :-(

Another question regarding masks: is there a wildcard which works a bit like * but covers arbitrary depth of subdirectories? So in the example above, if $ was such a wildcard, I would include all HTML files at any depth by specifying the include mask "$/*.html". (This would also avoid the issue discussed above.)
No. If I understand it right, you ask for the same thing as in the first question, just using different words. Right? :-)

This would be useful, because, for example, at the moment I have a transfer preset which uses the remote directory mask "*/public_html; */public_html/*; */public_html/*/*" which applies to a remote directory (at any depth) called public_html and also to exactly one or two levels of subdirectory below such a directory. I would like to be able to say "*/public_html/$" so that it matches any subdirectory of public_html at any depth. Without this, I am forced to add a new mask of the form "*/public_html/*/*/...../*/*" every time I want it to apply to a deeper subdirectory.
Maybe I'm missing something, but IMHO you can match anything below .../public_html using the mask "*/public_html/*; */public_html/*/*". A shorter but less precise form would be "*/public_html*/*".

(Note that the way of specifying a relative path---prefixing the path by */ as in "*/public_html"---is anomalous because * is not allowed to represent more than one piece (filepart or folderpart) of a path in any other situation than this. With a $ wildcard, the syntax could instead be "$/public_html".)
Technically, the last slash in mask (except for the trailing slash that indicates directory mask) separates the mask into two pieces: file mask and path mask. All other slashes are treated literally.
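
A minimal Python sketch of the splitting rule described above may help. It assumes, as a simplification, that the path mask is matched against the containing directory and that `*` may span slashes (which is how Python's `fnmatch` behaves). The functions `split_mask` and `matches` and the sample paths are hypothetical illustrations, not WinSCP code.

```python
from fnmatch import fnmatchcase

def split_mask(mask):
    """Split a mask at its last slash into (path_mask, file_mask, dir_only)."""
    dir_only = mask.endswith("/")
    if dir_only:
        mask = mask[:-1]  # the trailing slash only flags a directory mask
    path_mask, slash, file_mask = mask.rpartition("/")
    # No slash left means there is no path mask at all.
    return (path_mask if slash else None), file_mask, dir_only

def matches(directory, name, mask_list, is_dir=False):
    """Toy model: test `name` inside `directory` against a ';'-separated mask list."""
    for mask in (m.strip() for m in mask_list.split(";")):
        path_mask, file_mask, dir_only = split_mask(mask)
        if dir_only and not is_dir:
            continue
        # Slashes inside the path mask are literal; '*' may cover several levels.
        if path_mask is not None and not fnmatchcase(directory, path_mask):
            continue
        if fnmatchcase(name, file_mask):
            return True
    return False

precise = "*/public_html/*; */public_html/*/*"
short = "*/public_html*/*"

print(split_mask("*/public_html/*/*"))
print(matches("/home/a/public_html", "index.html", precise))
print(matches("/home/a/public_html/x/y/z", "deep.html", precise))
print(matches("/home/a/public_html_old", "f.txt", precise))
print(matches("/home/a/public_html_old", "f.txt", short))
```

Under these assumptions the two-part mask covers files at any depth below public_html, while the shorter form also lets through a sibling such as "public_html_old", which is one way to read "less precise".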

Anton P.

martin wrote:

Anton P. wrote:

this mask results in subdirectories being created on the local machine which are empty because they didn't contain any HTML files on the remote machine. Is there any way to avoid this?
No :-(

A pity. Never mind.
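
For anyone who does mind, one possible workaround outside WinSCP is to prune the empty directories locally after the download has finished. The following is a minimal Python sketch under that assumption; `remove_empty_dirs` and the demo layout are hypothetical, and nothing here is a WinSCP feature.

```python
import os
import tempfile

def remove_empty_dirs(root):
    """Delete empty directories below `root`, deepest first; return the removed paths."""
    removed = []
    # topdown=False visits children before parents, so a directory that
    # becomes empty after its children are removed is caught as well.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue  # never delete the download root itself
        if not os.listdir(dirpath):
            os.rmdir(dirpath)
            removed.append(dirpath)
    return removed

# Demo on a throwaway directory standing in for the local download target.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "site", "images"))  # contained no HTML, so it is empty
os.makedirs(os.path.join(demo_root, "site", "docs"))
open(os.path.join(demo_root, "site", "docs", "index.html"), "w").close()

removed = remove_empty_dirs(demo_root)
print(removed)
```

Note that this also removes directories that were already empty on the remote side, so it only suits cases where empty directories carry no meaning.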

martin wrote:

If I understand it right, you ask for the same thing as in the first question, just using different words. Right? :-)

Well, the second question was a generalisation of the first. :-)

martin wrote:

Maybe I'm missing something, but IMHO you can match anything below .../public_html using the mask "*/public_html/*; */public_html/*/*". A shorter but less precise form would be "*/public_html*/*".

Indeed you're right. I thought I had the * figured out, but now I see that * really can match "chunks" of pathnames, not just individual (parts of) foldernames or filenames. It was the information about the last slash in mask (except for the trailing slash that indicates directory mask) being the one which separates the path mask from the file mask that made it all make sense.

So something like the $ wildcard is only needed if there's no other way of solving the first issue. (Not that this issue has bothered me personally.)

Cheers then,
Anton
