Analyze the Messier Catalog

Yasmeen Asali, William Cerny, Pratik Gandhi (Yale University)

Description: Use the Messier catalog to practice using for loops and logic

Intended Audience: Beginner Undergraduate

tags: libraries:numpy, loops, program-flow, logic

Requirements: requirements.txt

Last Updated: July 18, 2025

Learning Objectives

  1. Understand the structure of 2D arrays and how to access data.

  2. Practice using logic to downselect rows from an array and practice using for loops to iterate through objects.

In this assignment, you will practice logic and loops with a dataset of objects from the Messier Catalog. The dataset includes basic information about each object, such as its type, magnitude, distance, constellation, and best viewing season. Your tasks will involve reading in the data, analyzing it, and using NumPy, logic, and loops to answer questions about it!

Opening the Dataset

The data is stored in a .npy file, which you can load with np.load(). You can download it here (right-click and save as a .npy file). After importing NumPy with import numpy as np, use the following line of code to open the file and store its contents in a variable called data.

data = np.load('/Users/username/Downloads/messier_data.npy')

Caution

Will the above line run on your computer? No! Do you remember what you need to change about it?

Each row in the dataset corresponds to a single Messier object, with the following fields:

  • Messier: Name of the Messier object as a string (e.g., 'M107', 'M108')

  • RA and DEC: the Right Ascension and Declination of the object (coordinates in the sky)

  • Type: Type of object (e.g., 'Gc' for Globular Cluster, 'Sp' for Spiral, 'Ba' for Barred Spiral)

  • Mag: Magnitude (brightness) of the object. Magnitudes are a unitless system, and lower numbers mean brighter objects.

  • Distance: Distance from Earth in units of light-years

  • Constellation: The constellation in which the object resides

  • Season: The best viewing season (spring, summer, autumn, winter)

Here are some tips and reminders for using large 2D datasets:

  1. You can access each row of the dataset using integer indexing. For instance, data[0] will return the first row of the array (that is, all of the above column values for a single Messier object).

  2. You can access each column of the dataset by its name, for example data['Messier']. Here, rather than indexing by an integer, we are indexing by a column key (the column name). This will return a NumPy array of the Messier names for all of the objects.
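The two kinds of indexing described above can be tried out on a small stand-in array before touching the real file. The sketch below assumes the dataset behaves like a NumPy structured array (which is what indexing by column name suggests). The array toy, its three rows, and the magnitude values are hypothetical placeholders made up for illustration; they are not taken from messier_data.npy, and only three of the columns are included.

```python
import numpy as np

# Hypothetical stand-in for the real dataset: three rows, three of the columns.
# The magnitude values are illustrative placeholders, not real catalog data.
toy = np.array(
    [("M107", "Gc", 8.0), ("M108", "Sp", 10.0), ("M109", "Ba", 9.5)],
    dtype=[("Messier", "U4"), ("Type", "U2"), ("Mag", "f8")],
)

first_row = toy[0]               # integer index: every column for one object
names = toy["Messier"]           # column name: one column for every object
first_name = toy[0]["Messier"]   # both together: a single value

print(first_row)
print(names)
print(first_name)
```

Combining the two (a row index followed by a column name) picks out a single entry, which is a handy pattern inside a for loop that visits the rows one at a time.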