{"id":792,"date":"2020-05-12T17:28:06","date_gmt":"2020-05-12T17:28:06","guid":{"rendered":"http:\/\/jackofalltradesmasterofsome.com\/blog\/?p=792"},"modified":"2020-05-12T17:28:08","modified_gmt":"2020-05-12T17:28:08","slug":"robotic-python-automation-automating-working-with-files-with-python-cleaning-sorting-and-archiving","status":"publish","type":"post","link":"https:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/05\/12\/robotic-python-automation-automating-working-with-files-with-python-cleaning-sorting-and-archiving\/","title":{"rendered":"Robotic Python Automation &#8211; Automating Working with Files with Python. (Cleaning, Sorting and Archiving)"},"content":{"rendered":"\n<p>Robotic Python Automation Files Cleaning Sorting and Archiving.<\/p>\n\n\n\n<p>Sometimes we have folders where files are saved that get out of control. Sometimes we have Excel documents or Work files that pile up and we need an efficient way to sift and sort through them and either archive files or delete them. Using Python, we can build a script that automatically completes two tasks. 
<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>Looks for files of a certain name and archives them<\/li><li>Looks for files older than a certain date and deletes them<\/li><\/ol>\n\n\n\n<!--more-->\n\n\n\n<h2 class=\"wp-block-heading\">Setup<\/h2>\n\n\n\n<ol class=\"wp-block-list\"><li>Create a new Jupyter Notebook and name it Document_Automation<\/li><\/ol>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"166\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image.png\" alt=\"\" class=\"wp-image-794\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image.png 624w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-300x80.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>Create a new folder on your C drive, \u201cC:\\Python\\Automation\u201d, and seed it with 10 empty text files that follow the naming convention file_YYYYMMDD and log_YYYYMMDD. Inside it, create two folders called \u201cFiles\u201d and \u201cLogs\u201d.<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"261\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-1.png\" alt=\"\" class=\"wp-image-795\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-1.png 624w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-1-300x125.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Sorting and Archiving<\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>The following command will list all text files located in the folder you are searching and place their paths in a list called \u201cmylist\u201d. 
The \u201c*\u201d is a wildcard that allows you to search for all character combinations that end with \u201c.txt\u201d. This can be updated to look for other file types.<\/li><\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>import glob\nmylist = glob.glob(\"C:\/Python\/Automation\/*.txt\")\nprint(mylist)<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"100\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-2.png\" alt=\"\" class=\"wp-image-796\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-2.png 624w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-2-300x48.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>Next, we can try to create the folders where we will organize our files. Since we already created them, os.mkdir will raise an OSError and the except branch will print a failure message, but if you had deleted them you will see them reappear. 
The os and shutil modules need to be imported for these commands to work.<\/li><\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>import os, shutil\n\n# Create the two destination folders; os.mkdir raises OSError if one already exists\ntry:\n    os.mkdir(\"C:\/Python\/Automation\/Files\")\n    os.mkdir(\"C:\/Python\/Automation\/Logs\")\nexcept OSError:\n    print(\"Folder Creation Failed\")\nelse:\n    print(\"Folder Creation Success\")\n<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"129\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-3.png\" alt=\"\" class=\"wp-image-797\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-3.png 624w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-3-300x62.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Start with a loop to go through your list<\/li><\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>for i in mylist:\n    print(i)<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"442\" height=\"208\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-4.png\" alt=\"\" class=\"wp-image-798\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-4.png 442w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-4-300x141.png 300w\" sizes=\"auto, (max-width: 442px) 100vw, 442px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>Let's update the loop to search for any \u201clog\u201d files first. This can be done with the string <a href=\"https:\/\/www.tutorialspoint.com\/python\/string_startswith.htm\">startswith()<\/a> method or with the \u201c<a href=\"https:\/\/www.afternerd.com\/blog\/python-string-contains\/\">in<\/a>\u201d operator in an if statement. 
Both work in this case, but be careful if your file naming rules can change. We will use the latter example for the rest of the tutorial.<\/li><\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code># Option 1: match on the start of the path (glob on Windows joins the file name with a backslash)\nfor i in mylist:\n    if i.startswith(r'C:\/Python\/Automation\\log_'):\n        print(i)\n\n# Option 2: match \"log\" anywhere in the path\nfor i in mylist:\n    if \"log\" in i:\n        print(i)\n<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"208\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-5.png\" alt=\"\" class=\"wp-image-799\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-5.png 624w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-5-300x100.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Now move the files found in the search to the destination folder using the <a href=\"https:\/\/www.geeksforgeeks.org\/python-shutil-move-method\/\">shutil.move()<\/a> function. It needs a source path and a destination path, so we use the string <a href=\"https:\/\/www.tutorialspoint.com\/python\/string_replace.htm\">replace()<\/a> method to build the destination from the source. Once run, you will see the files moved to the respective folders. 
<\/li><\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>for i in mylist:\n    if \"log\" in i:\n        # Insert the Logs folder into the path to build the destination\n        source = i\n        destination = i.replace('C:\/Python\/Automation', 'C:\/Python\/Automation\/Logs')\n        shutil.move(source, destination)<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"91\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-6.png\" alt=\"\" class=\"wp-image-800\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-6.png 624w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-6-300x44.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"181\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-7.png\" alt=\"\" class=\"wp-image-801\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-7.png 624w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-7-300x87.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>We can now do the same for the files whose names contain \u201cfile\u201d, moving them into the \u201cFiles\u201d folder<\/li><\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>for i in mylist:\n    if \"file\" in i:\n        source = i\n        destination = i.replace('C:\/Python\/Automation', 'C:\/Python\/Automation\/Files')\n        shutil.move(source, destination)\n<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"265\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-8.png\" alt=\"\" class=\"wp-image-802\" 
srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-8.png 624w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-8-300x127.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Cleaning out old Files<\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Finally, we want to delete any old files. We can do this with os.remove, using os.stat and the time module to compare each file's last modified time against the current time. The cutoff is expressed in seconds: 1800 is 30 minutes, and 7 * 86400 is seven days, so it can be adjusted to make the window as short or as long as needed. We need to import os and time for these commands. As written, the active if statement matches every file last modified before the script started, so it removes everything in the folder; uncomment the other if statement (and remove the active one) to delete only files older than the chosen window.<\/li><\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>import os, time\n\npath = r'C:\/Python\/Automation\/Logs'\n\n# Current time in seconds since the epoch\nnow = time.time()\n\nfor f in os.listdir(path):\n    # To delete only files older than 7 days, use this condition instead:\n    #if os.stat(os.path.join(path,f)).st_mtime &lt; now - 7 * 86400:\n    if os.stat(os.path.join(path,f)).st_mtime &lt; now:\n        print(\"Removing File\")\n        os.remove(os.path.join(path,f))<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"259\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-9.png\" alt=\"\" class=\"wp-image-803\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-9.png 624w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/05\/image-9-300x125.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<p>Check out these Udemy classes below to get started in Python<\/p>\n\n\n\n<p>And other articles below.<\/p>\n\n\n\n<p>Robotic Python Automation Files Cleaning Sorting and Archiving<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Robotic Python Automation Files 
Cleaning Sorting and Archiving. Sometimes we have folders where files are saved that get out of control. Sometimes we have Excel documents or Work files that pile up and we need an efficient way to sift and sort through them and either archive files or delete them. Using Python, we can [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":807,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[97,94],"tags":[8,98],"class_list":["post-792","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-automation","category-python","tag-automation","tag-python"],"_links":{"self":[{"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/792","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/comments?post=792"}],"version-history":[{"count":12,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/792\/revisions"}],"predecessor-version":[{"id":816,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/792\/revisions\/816"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/media\/807"}],"wp:attachment":[{"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/media?parent=792"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/categories?post=792"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jackofalltradesmast
erofsome.com\/blog\/wp-json\/wp\/v2\/tags?post=792"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}