Splitting up a big CSV file to multiple smaller CSV files using PowerShell

Sometimes you have the problem that you get large Excel or CSV lists to work with but you want to split them up by a certain criteria, e.g. company. Your goal may be to be able to create multiple Exchange Hybrid migration batches.

To guide you through this task, my example data looks like this:

Firstname;LastName;Country
Max;Mustermann;Germany
Maria;Musterfrau;Germany
Igor;Smirnow;Russia
John;Dow;USA
Felix;Frisch;Germany

As you can see, it doesn’t matter how the data is sorted.

To get started you must specify some parameters to control the behaviour how PowerShell will split the file:

$GroupField = "Country"
$Delimiter = ";"
$csv = "C:\tmp\MyBigCSVWithTonsOfData.csv"
$outfolder = "C:\tmp\CSV-Files\"

What do these parameters do? It’s simple:

  • $GroupField specified which CSV column will taken as identifier for the split process.
  • $Delimiter specified the delimiter you are using in your CSV.
  • $csv is the full path to the CSV file you want to split up in multiple smaller files.
  • $outfolder is the folderpath where the generated CSV files will be stored.

After the preparation you must read the CSV and prepare the data to be splitted:

$all = Import-Csv $csv -Delimiter $Delimiter
$groupedCollection = $all | Group-Object $GroupField

You see I use the command “Group-Object” which acts like the Excel function “Filter”. With this command you can group structured data and the result are multiple groups which contain the single entries. So if your CSV input file contained 5 entries, myou group by country and 3 of the people have “Germany”, 1 has “USA” and one has “Russia” the result of these 2 lines will be a return value of 3 groups:

  • One group with 3 entries – the people with country “Germany”
  • One group with 1 entry – the person with country “USA”
  • One group with 1 entry – the person with country “Russia”

PSGroup

Having a deeper look at the return value, you can see that the variable $groupcollection is an array. You can check this by executing this command:

$groupCollection.GetType()

Working with an array is easy, you probably know it from other scripts you wrote with PowerShell. Just create a foreach-loop and iterate through the items:

foreach($group in $groupedCollection)
{
   Write-Host ("Group '" + $group.Name + "' // " + $group.Count.ToString() + " Members")
   $group.Group | ConvertTo-Csv -NoTypeInformation -Delimiter "," | Out-File ($outfolder + $group.Name + ".csv")
}

And voila, the result is that you have 3 CSV files – one for each country:

PSGroup2

Now you can continue your work with the new smaller CSV files. As you may have noticed, I’ve hard-coded the output files with delimiter “,”.  Of course, if you need another delimiter, feel free to adjust it to your needs.

So after all that you can copy/paste the whole tiny script at once here:

$GroupField = "Company"
$Delimiter = ";"
$csv = "S:\CSV-Files\MyBigCSVWithTonsOfData.csv"
$outfolder = "S:\CSV-Files\"

$all = Import-Csv $csv -Delimiter $Delimiter
$groupedCollection = $all | Group-Object $GroupField

foreach($group in $groupedCollection)
{
   Write-Host ("Group '" + $group.Name + "' // " + $group.Count.ToString() + " Members")
   $group.Group | ConvertTo-Csv -NoTypeInformation -Delimiter "," | Out-File ($outfolder + $group.Name + ".csv")
}

Published by Andreas

Founder of M365 Evangelists Cloud-Architect, Strategy Consultant, Consultant for Microsoft technologies, Graph API enthusiast, PowerShell enthusiast

One comment on “Splitting up a big CSV file to multiple smaller CSV files using PowerShell”

Leave a Reply