Sorting multiple lists

In this post, I am going to share with you guys how to sort multiple lists based on one of them. This post is a continuation of the previous post, at a slightly higher level of difficulty, so if you haven’t checked that one out yet, please do so.

(WARNING: Long post ahead)

Previously, Han shared the sorted() function in Python for a single list of items. But what if you are given multiple lists of items and need to sort them all together? How do you do it without breaking the correspondence between the lists? In this post, I am going to share two ways that we discovered (if you know of another approach, feel free to let us know in the comments below).

For this post, I am going to use a simple data set with four columns (name, age, sex, and height) stored in Excel. The table below shows part of the data; 2000+ rows were used to simulate a real-life application.

Name        Age   Sex      Height
Adele       51    Female   168
Lindsey     85    Female   156
Jeannette   23    Female   165
Minerva     19    Female   173
Ronald      99    Female   175
Garth       83    Female   156
Merrill     4     Male     154
Lydia       83    Male     161
Marcus      30    Male     154
Sheila      66    Male     175
Keith       14    Male     155
Boris       82    Female   152

To access the data in Excel, we can simply use the nodes shown in the picture below, which are self-explanatory:

But from the output of the “Excel.ReadFromFile” node, we can see that the first list is the header row, which is not needed in this application. Hence, we drop the first item before proceeding, as shown in the picture below.
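For readers who prefer to see the header drop in code rather than nodes, here is a minimal sketch in plain Python. The `raw_rows` list is a hypothetical stand-in for what the node returns (a list of rows with the header first); it is not the actual node output.

```python
# Hypothetical sample mimicking a list of rows read from Excel,
# with the header row as the first item.
raw_rows = [
    ["Name", "Age", "Sex", "Height"],
    ["Adele", 51, "Female", 168],
    ["Lindsey", 85, "Female", 156],
    ["Jeannette", 23, "Female", 165],
]

# The first item is the header row.
header = raw_rows[0]

# Slicing from index 1 onwards keeps everything except the header.
data_rows = raw_rows[1:]

print(header)
print(data_rows)
```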

Now, what if you are required to sort the lists by name? You can transpose the list to group the values by header, BUT what's next? How do you go about it? From here, I will show you guys 2 methods that can be used to sort multiple lists without breaking the correspondence between them.
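To see why this needs some care, here is a small sketch (plain Python, with made-up sample values taken from the table) of what goes wrong if you sort only the names and leave the other list alone:

```python
names = ["Lindsey", "Adele", "Jeannette"]
ages = [85, 51, 23]

# The correct pairing before sorting: each name lines up with its own age.
original_pairs = dict(zip(names, ages))

# Sorting only the names changes their order, but the ages stay put,
# so the two lists no longer line up.
sorted_names = sorted(names)
broken_pairs = list(zip(sorted_names, ages))

print(original_pairs["Adele"])  # Adele's real age
print(broken_pairs[0])          # Adele now sits next to Lindsey's age
```

Both methods below avoid this by keeping each row together while the sorting happens.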

1. Using dictionary

The first method is, of course, to use a dictionary. This approach was inspired by Han. (Thank you Han). By using the key-value pairs of a dictionary, we can map each name to its respective data. For those of you who are not familiar with dictionaries in Python and want to know more, drop a comment down below and I will write a post about them 🙂

Back to the topic: we can simply feed the data into the Python script and use a few dictionaries and a for loop to sort the lists accordingly. The output/python script is shown below.

# Import python library
import sys
pyt_path = r'C:\Program Files (x86)\IronPython 2.7\Lib'
sys.path.append(pyt_path)
import os
import shutil
import math
# Import math library
from math import *

#Import date and time (used to time the script)
from datetime import datetime
now = datetime.now()

#Import System Library
import System
from System.Collections.Generic import *
from System.IO import Directory, Path

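#Preparing input: a list of rows, each [name, age, sex, height]
datas = IN[0]

# Map each name to the value of every column.
# Dictionary keys are unique, so a repeated name would overwrite the earlier row.
dict_name = {data[0]: data[0] for data in datas}
dict_age = {data[0]: data[1] for data in datas}
dict_sex = {data[0]: data[2] for data in datas}
dict_height = {data[0]: data[3] for data in datas}

names, ages, sexs, heights = [], [], [], []

# Walk through the names in sorted order and collect the matching values
for key in sorted(dict_name.keys()):
	names.append(dict_name[key])
	ages.append(dict_age[key])
	sexs.append(dict_sex[key])
	heights.append(dict_height[key])

#Final output: the sorted columns plus the elapsed time
OUT = names, ages, sexs, heights, datetime.now() - now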

As you guys can see, it is as simple as that. Now to explain what is going on in the Python script. Basically, we build one dictionary per column, with the name as the key and the column's data as the value. From there, we sort the keys of one of the dictionaries and loop over them. For each sorted key, we look up the value in every dictionary and append it to the matching list. One thing to note: dictionary keys are unique, so this approach assumes no two rows share the same name; a repeated name would overwrite the earlier row. That is all, as simple as ABC, right?
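If you want to try the idea outside Dynamo, here is a minimal sketch in plain Python. The `rows` list is a hypothetical stand-in for `IN[0]`, and as a variation on the script above it uses a single dictionary that maps each name to its whole row, rather than one dictionary per column.

```python
# Hypothetical sample rows standing in for IN[0]: [name, age, sex, height]
rows = [
    ["Lindsey", 85, "Female", 156],
    ["Adele", 51, "Female", 168],
    ["Merrill", 4, "Male", 154],
    ["Jeannette", 23, "Female", 165],
]

# Map each name to its full row. Keys must be unique:
# a repeated name would overwrite the earlier row.
row_by_name = {row[0]: row for row in rows}

names, ages, sexes, heights = [], [], [], []

# Iterate over the names in sorted order and pull out each row's values.
for name in sorted(row_by_name):
    _, age, sex, height = row_by_name[name]
    names.append(name)
    ages.append(age)
    sexes.append(sex)
    heights.append(height)

print(names)
print(ages)
print(sexes)
print(heights)
```

Keeping the whole row as the value means that adding a column only changes the unpacking line, at the cost of being a little less explicit than one dictionary per column.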

2. Using zip()

For the second method, which I feel is cleaner and leaner but harder to grasp, we will use the zip() and sorted() functions together to achieve the same result as above.

As with the dictionary method, we just feed the data into the Python script and voilà, we get the same results, with the output and Python script shown below.

# Import python library
import sys
pyt_path = r'C:\Program Files (x86)\IronPython 2.7\Lib'
sys.path.append(pyt_path)
import os
import shutil
import math
# Import math library
from math import *

#Import date and time (used to time the script)
from datetime import datetime
now = datetime.now()

#Import System Library
import System
from System.Collections.Generic import *
from System.IO import Directory, Path

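#Preparing input
# Transpose the rows into columns; list() keeps the result indexable on Python 3 as well
datas = list(zip(*IN[0]))

names = datas[0]
ages = datas[1]
sexs = datas[2]
heights = datas[3]

# Zip the columns into rows, sort the rows by name (x[0]), then unzip back into columns
sortedlist = list(zip(*sorted(zip(names, ages, sexs, heights), key=lambda x: x[0])))
sortedlist.append(datetime.now() - now)

#Final output
OUT = sortedlist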

As you all can see, the code is much shorter (in terms of lines), and if more columns are added (let’s say more variables like school, eyesight, etc.), the zip() and sorted() version will grow far less than the dictionary version, which needs a new dictionary and a new list for every column.
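To illustrate the point about extra columns, here is a rough sketch of a helper that sorts any number of parallel lists together. The function name `sort_columns`, its `key_index` parameter, and the `schools` column are all made up for this example; they are not part of the scripts above.

```python
def sort_columns(columns, key_index=0):
    """Sort parallel lists together, ordered by the column at key_index."""
    # Turn the columns into rows so that each row travels as one unit.
    rows = zip(*columns)
    ordered = sorted(rows, key=lambda row: row[key_index])
    # Turn the sorted rows back into columns.
    return [list(column) for column in zip(*ordered)]


names = ["Lindsey", "Adele", "Merrill"]
ages = [85, 51, 4]
schools = ["North", "East", "West"]  # hypothetical extra column

by_name = sort_columns([names, ages, schools])
by_age = sort_columns([names, ages, schools], key_index=1)

print(by_name)
print(by_age)
```

Adding another column here only means adding one more list to the call; the helper itself does not change.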

Now to explain the code. Basically, zip(iterable, iterable, iterable, iterable) and zip(*iterable) were used in the Python code above. zip() pairs up the items of the lists in parallel, whereas zip(*) is the inverse of zip(), basically “unzipping” the list. If you still don't know what I mean, zip() and zip(*) basically serve the same function as the List.Transpose node in Dynamo. With sorted() explained previously, we use x as the name for each zipped row and x[0], which is the name (x[1] = ages, x[2] = sexs, x[3] = heights), as the sort key. After sorting, zip(*) is used to transform the data back into the corresponding list structure, as shown in the video above.
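Here is a small sketch in plain Python, with made-up sample values, that shows each of these pieces on its own: zip() to pair the lists up, zip(*) to undo it, and sorted() with a key in between. The list() calls are only there so the results can be printed on Python 3, where zip() returns an iterator.

```python
names = ["Lindsey", "Adele", "Merrill"]
ages = [85, 51, 4]

# zip() pairs the lists up item by item: one tuple per row.
rows = list(zip(names, ages))

# zip(*) is the inverse: it turns the rows back into one tuple per column.
columns = list(zip(*rows))

# sorted() orders the rows by x[0], the name, keeping each row intact.
rows_sorted = sorted(rows, key=lambda x: x[0])

# Unzip the sorted rows to get the sorted columns back.
names_sorted, ages_sorted = zip(*rows_sorted)

print(rows)
print(columns)
print(names_sorted)
print(ages_sorted)
```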

Well, of course, the unzipping part is not compulsory; it depends on what type of output you want. As you guys can see from the “timer” in both videos, the zip() method is faster than the dictionary method (although in this case only by a fraction of a millisecond). But what if millions of rows are supplied? Well, that is for you guys to try out and tell us more about :)
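If you want to try the comparison yourself, here is a rough sketch of how it could be set up. It assumes Python 3 run outside Dynamo (time.perf_counter is not available in IronPython 2.7), uses synthetic rows with unique made-up names, and only three columns to keep it short. The function names are invented for this example, and the timings will vary from machine to machine, so treat the numbers as indicative only.

```python
import random
import time


def sort_with_dicts(rows):
    # One dictionary per column, keyed by name (names must be unique).
    ages = {row[0]: row[1] for row in rows}
    heights = {row[0]: row[2] for row in rows}
    names = sorted(ages)
    return names, [ages[n] for n in names], [heights[n] for n in names]


def sort_with_zip(rows):
    # Sort the rows by name, then unzip them back into columns.
    names, ages, heights = zip(*sorted(rows, key=lambda row: row[0]))
    return list(names), list(ages), list(heights)


def time_call(func, rows):
    start = time.perf_counter()
    result = func(rows)
    return result, time.perf_counter() - start


# Synthetic data: increase row_count to see how each method scales.
random.seed(0)
row_count = 20000
rows = [
    ("name%07d" % i, random.randint(1, 99), random.randint(150, 180))
    for i in range(row_count)
]
random.shuffle(rows)

dict_result, dict_seconds = time_call(sort_with_dicts, rows)
zip_result, zip_seconds = time_call(sort_with_zip, rows)

print("dictionary method: %.4f s" % dict_seconds)
print("zip method:        %.4f s" % zip_seconds)
```

A single run like this is noisy; repeating each call several times and comparing the best times gives a fairer picture.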

Anyway, here are some tips that I always use to decipher long lines of code:

  1. Do not panic, obviously. If you panic, you will just throw that code away and not read it.
  2. Always read from the innermost call to the outermost one. To do that, you can press enter at the start of each function call and read from bottom to top.
  3. Functions are called with (), for example sorted(), zip(), list(), etc.
  4. Long lines of code are usually just made up of multiple function calls, nothing more. As long as you can work from the innermost call to the outermost one, I can guarantee that the line can be deciphered easily.
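As an illustration of these tips, here is a sketch that takes a one-liner in the style of the zip() script and rebuilds it one call at a time, from the innermost call outwards. The sample values and variable names are made up for the example.

```python
names = ["Lindsey", "Adele", "Merrill"]
ages = [85, 51, 4]

# The long line, all in one go.
one_liner = list(zip(*sorted(zip(names, ages), key=lambda x: x[0])))

# Step 1 (innermost call): pair the two lists up into rows.
paired = zip(names, ages)

# Step 2 (next call out): sort the rows by their first item, the name.
ordered = sorted(paired, key=lambda x: x[0])

# Step 3 (outermost calls): unzip the rows back into columns.
step_by_step = list(zip(*ordered))

print(one_liner)
print(step_by_step)
```

Both versions produce the same result; the second one just gives every intermediate value a name you can print and inspect.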

And that is the end of the post. If you have doubts or queries, feel free to drop a comment down below or email either me or Han. As always, the downloadable files are located below. Happy coding~

Download links:
