Monday, September 21, 2020

Cleaning empty dirs with Powershell gci, hidden files like .DS_Store, and the power of the force

I saw a handy post recently using powershell's get-childitem (gci) to remove empty folders, which is great for truly empty directories with nothing in them. However, it didn't cover the cases where hidden or tiny files are left behind in directories (hello, mac SMB clients) that you might want to ignore when cleaning things up. I wanted to add on to that code as an exercise, so read on for a few ways to solve this if your directories aren't truly empty but you still need to clean them up.

I'm not a mac guy, but one thing I've come across is macs leaving behind little hidden or special files in EVERY directory for some odd reason. (To be fair, PCs do it too with thumbnail cache files.) These tiny files everywhere are painful for the system admin, since they add overhead for backups and the devices serving the file systems, and in the code example above they can cause directories to be seen as 'not empty' due to an index file or stub hanging out alone in there.

Note: I'm on a linux host so the file paths are different, and I'm using PS Core 7 in my testing. Also: do your own code testing and validation of course, and feel free to use the below code at your own risk.

So to start with here's the original PS code from the link, which catches empty directories but would not catch a directory that has a single ".DS_Store" hidden file in it for example.
(gci "C:\dotnet-helpers\TEMP Folder" -r | ? {$_.PSIsContainer -eq $True}) | ?{$_.GetFileSystemInfos().Count -eq 0} | remove-item
The catch is that by default get-childitem (gci) doesn't return hidden files, which we may not care about initially, but later the code runs a given directory through .GetFileSystemInfos().Count. That call returns the count of ALL files in the directory, including hidden ones, so directories found initially that contain remnants or hidden .DS_Store files will not be cleaned up/deleted. This is a problem if you have .DS_Store files in every otherwise empty directory. The solution is to use the force.
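To see the difference for yourself, here's a quick demo. This is just a sketch assuming a Unix-like host (where dot-files count as hidden in PS Core); the /tmp path is only for illustration:

```powershell
# Make a directory whose only content is a hidden .DS_Store file
$demo = New-Item -ItemType Directory -Path "/tmp/gci-demo" -Force
New-Item -ItemType File -Path "/tmp/gci-demo/.DS_Store" -Force | Out-Null

# Default gci skips the hidden file, so the directory looks empty
(Get-ChildItem $demo.FullName).Count

# -Force includes the hidden file in the count
(Get-ChildItem $demo.FullName -Force).Count

# .GetFileSystemInfos() always counts hidden files, -Force or not
$demo.GetFileSystemInfos().Count
```

The first count comes back 0 while the other two come back 1, which is exactly why the original snippet skips these directories.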
"Don't underestimate the force" -Darth Vader
To begin with I refactored the code into two lines to make it easier to read and work with later. I also updated the first gci to catch hidden files by using the -force parameter. So the first gci using -force builds the list of all directories found, and the second line loops through each directory returned to count its contents and remove only directories with no files whatsoever, hidden or not. This is the same end result as the first snippet; we're just using a second gci instead of .GetFileSystemInfos().Count so we have flexibility in the following examples. (You can remove the -whatif to really delete directories, and add a -confirm:$false to not prompt if desired.)

Delete only completely empty directories:
$dirs = gci "/home/dandill/test" -r -Force | ? {$_.PSIsContainer -eq $True}
foreach ($d in $dirs){if((gci $d -force).count -eq 0){$d | remove-item -whatif}}
Then if we want to remove directories that have no regular files but still contain hidden files, we can do that like so. (Note the lack of a -force on the second line, so hidden files are not counted when evaluating whether a directory is considered 'empty'.)

Delete all directories with no regular files, and still delete them if they have any hidden files:
$dirs = gci "/home/dandill/test" -r -Force | ? {$_.PSIsContainer -eq $True}
foreach ($d in $dirs){if((gci $d).count -eq 0){$d | remove-item -whatif}}
Lastly, if we had a bunch of empty directories, some containing only hidden files named ".DS_Store", we could delete those too by specifically excluding them in the second gci with the -exclude parameter. Basically we're saying to look at all files in each directory, but to skip .DS_Store files in the second gci, and if that comes up with zero results, delete the directory.

Delete all directories that are completely empty, or contain only the ".DS_Store" file:
$dirs = gci "/home/dandill/test" -r -Force | ? {$_.PSIsContainer -eq $True}
foreach ($d in $dirs){if((gci $d -exclude ".DS_Store" -force).count -eq 0){$d | remove-item -whatif}}
You can of course modify that -exclude parameter on the second line with wildcards or as desired if you have other files hanging out causing your directories not to be considered empty.
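If you find yourself running these variants a lot, the pattern can be wrapped into a small reusable function. This is just a sketch under the same assumptions as above; Remove-EmptyDirectory and its -IgnoreFile parameter are names I made up, and -WhatIf is passed through so you can dry-run first:

```powershell
# Sketch: remove directories that are empty once the -IgnoreFile patterns are excluded.
# Hypothetical helper name; always test with -WhatIf before deleting for real.
function Remove-EmptyDirectory {
    [CmdletBinding(SupportsShouldProcess = $true)]
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [string[]]$IgnoreFile = @()   # e.g. ".DS_Store", "Thumbs.db"
    )
    # -Force so hidden directories are found too
    $dirs = Get-ChildItem $Path -Recurse -Force | Where-Object { $_.PSIsContainer }
    foreach ($d in $dirs) {
        # Count everything (hidden included) except the ignored names
        if ((Get-ChildItem $d.FullName -Force -Exclude $IgnoreFile).Count -eq 0) {
            # -Recurse so leftover ignored files (like .DS_Store) go with the directory
            $d | Remove-Item -Recurse -Force
        }
    }
}

# Dry run: shows what would be deleted without touching anything
Remove-EmptyDirectory -Path "/home/dandill/test" -IgnoreFile ".DS_Store" -WhatIf
```

Because the function declares SupportsShouldProcess, the -WhatIf on the call flows down to the Remove-Item inside it.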

So there you have it: use -force if you want to clean up directories with PS and gci but have a hidden file or two preventing that with the defaults. Hope this is helpful or useful to you!

Thursday, July 30, 2020

Powershell - divide a list up by the number of days left in week

Here's a sample powershell code snippet that takes an input file containing a list and dynamically divides it up based on the number of days left till Friday (in this example). This can be used if you want to action on a pool of given items throughout the week based on the number of days left in the week. We're also assuming that the final day is the 'take the entire list' day.

For example: You have a list of 500 machines to update each day throughout the work week, and it's Monday. If you throw the 500-machine list at the below script, you can action on the first 100 of that list (5 days between Mon-Fri). Run it Tuesday and it will action on the first 125 items in the list (including the 100 from Monday). This is important to be aware of: the code below needs the input list to be validated at run time, as it has no concept of state. It doesn't handle the input list possibly changing mid-week, previous runs of the code, future runs to occur, etc. So you'll want to add code at "#Here is where you might do any qualification..." to suit your needs, removing items that were already affected by previous runs or are not valid when the script is run. Alternately, if you prefer or can run against items again, I've included one commented way to do that below. Either way, be aware that the code below without modification, used for our example of 500 items, will result in selection like this:

Day Items Selected
Mon 100
Tue 125
Wed 166.6
Thur 250
Fri(all) 500

So be sure to add in your validation code (at the "#Here is where you might do any qualification" comment), or change the list selection method as desired for your purposes (at the "#Divide list count" comment). Hopefully this is a helpful example should you need to use powershell to action on a portion of a given list evenly between now and an end date/time. Being powershell, this could be used against virtual machines, identities, or anything else you might want to spread out over a given time frame (without the extra effort of manually dividing up the list and introducing human error).


# Example powershell code snippet for taking an input via a file with a list and
# dynamically dividing that up based on the number of days left till friday.
# Can be used if you want to action on a pool of given items throughout the week.
# Virtual machines, identities, etc.
# This example code does not do a number of things which you might think about if 
# using it for production, here are a few:
#  -Validation of the input list (are all the values formatted as desired, and valid)
#  -In the same vein, without additional code, additional runs of the script will
#   result in the actions being taken against the same items in the list
#  -Sorting or processing of the list in any way other than how it is in the list ingested
#  -Better logging beyond plain write-host output
#  -Dependent modules (if you need to call them)
#  -Time zones - this just uses the local result of get-date on the host executing the
#   script, so if you're crossing datelines between the script host and affected items,
#   or even timezones you might consider that
#
# Code is given as an example. Feel free to use it if it's helpful to you, but please
# do your own diligence and validation of course.

Param (
  #Script parameters go here, this is an input file
  [Parameter(Mandatory = $true)][string]$file
)

#Set Error Action to Silently Continue
$ErrorActionPreference = 'SilentlyContinue'

#Testing variables
$testing = $false

#Read your list of whatever for this week
$list = get-content $file
$totalcount = $list.count

#Here is where you might do any qualification, or drop items from the list that
#don't require action, were already actioned on by a previous run, etc.

#Store variable for today and figure out number of days till friday
$today = get-date
$fridaydate = Get-Date
while ($fridaydate.DayOfWeek -ne "Friday") {$fridaydate = $fridaydate.AddDays(1)}
$daysleft = (New-TimeSpan -Start $today -End $fridaydate).days

#Divide list count by days left till Fri and compile a list for today
#(Ceiling keeps the slice size a whole number; the final-day case is handled below)
if ($daysleft -gt 1) {
  $numbertoactionon = [math]::Ceiling($totalcount / $daysleft)
  $numbertoactiontoday = $list[0..($numbertoactionon - 1)]
}

#Alternately, if you assume a Mon-Fri run once daily, and idempotency of
#your actions below, then you could use something like this to select an
#additional one fifth of the total items each run/day:
#$numbertoactionon = [math]::Ceiling(($totalcount/5) * (6-$daysleft))
#$numbertoactiontoday = $list[0..($numbertoactionon - 1)]


#If it's the last day - in this example Fri - then we're grabbing all the items left
if ($daysleft -le 1){
  $numbertoactiontoday = $list
}

#Loop through today's items in the list and output to screen while taking action
foreach ($i in $numbertoactiontoday){
 if($testing){
 #Insert your test case code here
 Write-host "Testing $i"
 #Do some testing
 }else{
 #Insert your prod (non-test) code here to do whatever
 Write-host "Processing $i"
 #Do something for real
 #Sleep after each individual action, just to space things out over time if desired
 write-host "sleeping 60 seconds"
 start-sleep 60
 write-host "onward!"
 }
}

Wednesday, July 29, 2020

Terraform cloud region selection based on country

Here's a basic example of terraform code to automatically select a given cloud region based on a pre-configured mapping and the public IP of the machine running terraform. I will walk through the code pieces below:

The first piece of code sends a request to an internet-accessible geoip provider. (I just picked one out there; there are many, but be aware of free-tier restrictions if using in prod.) This http request is made via the terraform http provider, which stores the JSON response in data.http.geoipdata. You can pop open that URL in your browser to see the data it responds with.

# Grab current region info to use for auto choosing a region
data "http" "geoipdata" {
  url = "http://www.geoplugin.net/json.gp"

  # Optional request headers
  request_headers = {
    Accept = "application/json"
  }
}
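To get a feel for what comes back, here's a trimmed, illustrative response body. The IP and values are made up, and the field names follow geoplugin's conventions, with geoplugin_countryCode being the one we care about:

```json
{
  "geoplugin_request": "203.0.113.10",
  "geoplugin_status": 200,
  "geoplugin_countryCode": "US",
  "geoplugin_countryName": "United States"
}
```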
Once we have the response from that provider and it has told us which country we're in, we can later use that to find a cloud region. We have a variable to control whether this functionality is on, and it's set to on by default.

# Variable to enable auto region select
variable "enable_autoregion" {
  description = "If set to true, enable auto region choosing"
  type        = bool
  default     = true
}
We then build a map associating the source IP's country with a cloud region. So for example, if the client is in Singapore (SG) or Malaysia (MY) we want to use region "southeastasia", and for US clients, "westus2".

# This maps country code to azure region
variable "regionmap" {
  type = map(string)
  default = {
    "SG" = "southeastasia"
    "MY" = "southeastasia"
    "US" = "westus2"
  }
}

Lastly, we bring it all together using a local value. Here we are saying: check if var.enable_autoregion is true. If it is, we take the response from data.http.geoipdata.body, decode it as JSON, grab the value it gave us for "geoplugin_countryCode", and use that as the key into the var.regionmap map above. So the local value for someone in "US" would be "westus2". Then at the end of the expression, if for some reason var.enable_autoregion is not set to true, we set local.localregion to "eastus" as a fallback.

# This matches the country code to the map above to return the desired cloud region if the enable_autoregion var is true, otherwise defaults to eastus
locals {
  localregion = var.enable_autoregion == true ? var.regionmap[jsondecode(data.http.geoipdata.body).geoplugin_countryCode] : "eastus"
}
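One caveat with the map-index approach: if the detected country code isn't a key in var.regionmap, the lookup fails at plan time. A hedged variant using Terraform's built-in lookup() function falls back to a default region for unmapped countries instead (I've reused "eastus" here to match the disabled case, but that's just a choice):

```hcl
# Same idea, but unmapped country codes fall back to eastus instead of erroring
locals {
  localregion = var.enable_autoregion ? lookup(var.regionmap, jsondecode(data.http.geoipdata.body).geoplugin_countryCode, "eastus") : "eastus"
}
```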
That can then be used in your code for the region; in a basic terraform-defined Azure RG, it looks like this:

# Set up terraform resource group
resource "azurerm_resource_group" "tfrg" {
  name     = "yourrgname"
  location = local.localregion
}
Github code where I used this to stand up an azure instance is here.