Getting Files Placed in Folders Two Days Later


rsford31

Getting Files Placed in Folders Two Days Later

Hi,

I'm downloading zipped files where the folder structure is yyyy/mm/dd. The download is a daily process: the files have the current date in their timestamp, but their names carry the previous day's date. For example, if the current date is July 21st, the date/timestamp will be July 21st, but the files will have 2020-07-20 in the file name and will be found in the 2020/07/20 folder. Currently, with PowerShell, I'm getting the current date minus one day and using those values to build the source location to pull the files. I pull the files down locally, unzip them, change the extension to .csv, and then move the whole 2020/07/20 folder to a file server. I recently found out that another file is placed in the folder two days later. Using the same example, a file with 2020-07-20 in the file name will be placed in the 2020/07/20 folder on the 22nd and will have a date/timestamp of the 21st.
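
For reference, here is a minimal sketch of how that date-minus-one source path might be built in PowerShell (the local path is an assumption):
# Build yesterday's yyyy/mm/dd paths for the daily pull
$yesterday    = (Get-Date).AddDays(-1)
$remoteFolder = "/{0:yyyy}/{0:MM}/{0:dd}/" -f $yesterday     # e.g. /2020/07/20/
$localFolder  = "E:\temp\{0:yyyy}\{0:MM}\{0:dd}\" -f $yesterday
# The files are then downloaded from $remoteFolder, unzipped, renamed to .csv,
# and the whole local folder is moved to the file server.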

I originally thought that syncing the files from the FTP server directly to the file server would work, but given that the files change (they are unzipped and the extension is changed), there would always be two files in the folder: the zipped file and the CSV file. I don't want to keep a set of files on the local server just to sync with, either. I then thought that using the time constraint feature would work: I set the FileMask property to "*>=1D" and tried the source as /2020/07/*, /2020/07/19, 2020/07/19/*, and 2020/07/19/, but the files are not downloading.

Ideally, I would like to be able to sync the new files locally, unzip and rename them, and then push them to the file server. That way, any files placed in older folders will not be missed. I like that the sync functionality creates folders at the destination if they do not exist. I'm not sure whether this is possible with the functionality available in WinSCP.

Here is a code snippet:
$transferOptions = New-Object WinSCP.TransferOptions
$transferOptions.FileMask = "*>=1D"
$session.GetFiles("E:\temp\", "/2020/07/19/", $False, $transferOptions)
I'm using WinSCP 5.15.9.
Let me know! Thanks!



martin
Site Admin

Re: Getting Files Placed in Folders Two Days Later

rsford31 wrote:

Ideally, I would like to be able to sync the new files locally, unzip and rename them, and then push them to the file server. That way, any files placed in older folders will not be missed. I like that the sync functionality creates folders at the destination if they do not exist. I'm not sure whether this is possible with the functionality available in WinSCP.
You can use Session.SynchronizeDirectories to fetch the new files. And then iterate the list of downloaded files in SynchronizationResult.Downloads – extracting/renaming/uploading them.
https://winscp.net/eng/docs/library_session_synchronizedirectories
https://winscp.net/eng/docs/library_synchronizationresult
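
For illustration, a minimal sketch of that approach (the host name, credentials, and paths below are assumptions, not taken from this thread):
# Load the WinSCP .NET assembly
Add-Type -Path "WinSCPnet.dll"

$sessionOptions = New-Object WinSCP.SessionOptions -Property @{
    Protocol = [WinSCP.Protocol]::Sftp
    HostName = "example.com"        # assumed host
    UserName = "user"
    Password = "password"
}

$session = New-Object WinSCP.Session
try {
    $session.Open($sessionOptions)

    # Download anything new from the remote tree into a local staging folder
    $result = $session.SynchronizeDirectories(
        [WinSCP.SynchronizationMode]::Local, "E:\temp", "/2020", $False)
    $result.Check()

    # Work only with the files that were actually downloaded this run
    foreach ($download in $result.Downloads) {
        Write-Host "Downloaded $($download.FileName) to $($download.Destination)"
        # unzip, rename to .csv, encrypt, and upload to the file server here
    }
}
finally {
    $session.Dispose()
}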


rsford31

Re: Getting Files Placed in Folders Two Days Later

Hi,
That would work, but I'm:
- Unzipping each file
- Renaming the extension from .out to .csv
- PGP encrypting one of the files
Since the original zipped files will no longer exist at the destination, it would always re-download the files every day. That's why I was hoping for a way to loop through all the files at the source and pull only the newest ones along with their folder structure.


martin
Site Admin

Re: Getting Files Placed in Folders Two Days Later

So can you give us clear rules to identify "only the newest ones" given your file structure?


rsford31

Re: Getting Files Placed in Folders Two Days Later

Sorry... I know the subject is misleading. Folders are set up as YYYY\MM\DD. There are about 10 files for each day, and each file is zipped. Each filename contains a date matching its folder. For example, filename_20200827.gz, filename1_20200827.gz, etc. are placed in 2020\08\27; filename_20200828.gz, filename1_20200828.gz, etc. are placed in 2020\08\28, and so on.

Sometimes the vendor will create a new file for each day in the past and place it in the respective folders. So, for example, for a new filenameX_*.gz they will create a file for each past day, i.e. 2020\05\01\filenameX_20200501.gz; 2020\05\02\filenameX_20200502.gz; 2020\05\03\filenameX_20200503.gz; ...; 2020\08\27\filenameX_20200827.gz. Sometimes it might just be a new monthly file placed in the folder for the last day of the month, or it could be a daily file. Even though the files have a past date in the filename, their date/timestamp will be the previous day. Right now the transfers are occurring daily.
At the destination, the users want the folders to mimic what is at the source. What I was hoping for was a way to iterate through the folders at the source and pull only the new files down to the destination.
What I've ended up doing is syncing to a local folder, then using PowerShell to get the most recent files, removing the older files/folders, and working only with the newest ones, i.e. unzipping, renaming, encrypting, etc.
It definitely would've been simpler if the files were in one folder, but unfortunately that isn't the case.
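
A rough sketch of that cleanup step, assuming a local staging folder such as E:\temp and a one-day cutoff (both are assumptions):
# Keep only files written since yesterday; remove everything older
$staging = "E:\temp"
$cutoff  = (Get-Date).Date.AddDays(-1)

Get-ChildItem -Path $staging -Recurse -File |
    Where-Object { $_.LastWriteTime -lt $cutoff } |
    Remove-Item

# The remaining files are the newest ones: unzip, rename to .csv, PGP encrypt, etc.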



martin
Site Admin

Re: Getting Files Placed in Folders Two Days Later

I'm not sure I see the answer to my question in your post. What files do you want to download? Those that have the current day in their filename?


rsford31

Re: Getting Files Placed in Folders Two Days Later

The files with the current day's date/timestamp. A file could be created today for April 4th: the file name would be filename_20200404.gz and it would be placed in 2020\04\04, but with today's date/timestamp. So I would need a way to traverse all the folders under 2020 to find that file, or any other file created with today's date/timestamp.
If it had the current date in the file name, that would make my life so much easier, but that is not the case.


martin
Site Admin

Re: Getting Files Placed in Folders Two Days Later

That's what your original code should do, except that you have the paths backwards. It should be:
$transferOptions = New-Object WinSCP.TransferOptions
$transferOptions.FileMask = "*>=1D"
$session.GetFiles("/2020/07/19/*", "E:\temp\", $False, $transferOptions)


rsford31

Re: Getting Files Placed in Folders Two Days Later

From what I've read, and without having tried it, I wouldn't be able to do this, correct? I would have to dynamically build the folder structure and loop through the folders from 2020/01/01 to the current date, in case files have been added at any point during the year?
$transferOptions = New-Object WinSCP.TransferOptions
$transferOptions.FileMask = "*>=1D"
$session.GetFiles("/2020/*", "E:\temp\", $False, $transferOptions)



martin
Site Admin

Re: Getting Files Placed in Folders Two Days Later

I'm lost. Why shouldn't you "be able to do this"?
As I wrote in my previous post: If you correct the paths, the code should do what you want.


Guest

Re: Getting Files Placed in Folders Two Days Later

Sorry... I was thinking the sync wasn't designed to traverse through folders. I did some testing, and just for clarification, the files started generating back on March 8th. When I run the code as:

$transferOptions = New-Object WinSCP.TransferOptions
$transferOptions.FileMask = "*>=1D"
$session.GetFiles("/*", "e:\temp", $False, $transferOptions)

I get a folder for each day of each month: I have folders from e:\temp\03\08 all the way to e:\temp\09\08. Some folders contain the zipped files, since they have a date/timestamp matching *>=1D, while others are completely empty. It doesn't download the 2020 folder (i.e. e:\temp\2020\03\08 to e:\temp\2020\09\08), but creates a folder for each mm\dd from March 8th to the present directly under the temp folder. The only folders that contain files are those with a date/timestamp within the last day.

What I was looking for was to download only the files with a date/time within the last day (>=1D), as well as their parent folders, so I wouldn't get empty folders; just the folders containing files with the applicable timestamp. So if the vendor generated a new monthly report on the first of each month, I would see new_file_YYYYMMDD.gz in all the past folders. When the script is run, the previous day's files would be downloaded (the previous day's files are available the current day), and then the monthly file in each folder for the 1st. For example:
2020\09\08 - regular files
2020\09\01 - new monthly file
2020\08\01 - new monthly file
2020\07\01 - new monthly file
....
2020\03\01 - new monthly file

Those would be the only folders/files downloaded, with no empty folders for each previous day. However, you've provided me with lots of information and possible ways to achieve what I need. Thanks!
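
For what it's worth, one hedged way to combine the pieces discussed above is to run GetFiles with the time-constraint mask and then delete any local folders that end up empty (the paths are assumptions):
$transferOptions = New-Object WinSCP.TransferOptions
$transferOptions.FileMask = "*>=1D"      # only files modified within the last day
$session.GetFiles("/2020/*", "E:\temp\2020\", $False, $transferOptions).Check()

# GetFiles still creates the per-day folders locally, so remove the ones left empty
Get-ChildItem -Path "E:\temp\2020" -Recurse -Directory |
    Sort-Object FullName -Descending |
    Where-Object { -not (Get-ChildItem -Path $_.FullName -Recurse -File) } |
    Remove-Item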


