Exploring Gaia Star Clusters#
Learning Objectives
Query the Gaia Data Release 3 Database, which is a starting point for many investigations in modern astronomy
Interrogate the dataset to look for values that are missing, duplicated, or unreasonable. This is an important step not only for astronomical data, but for any large data set
Construct a color-magnitude diagram of stars in the star field.
Identify cluster members with proper motions.
Prerequisites#
These exercises assume:
Basic knowledge of Python coding
Basic knowledge of RA/Dec coordinates for finding star clusters
Basic knowledge of how color-magnitude diagrams are generated. More information on star clusters is available from the website "HR Diagrams and Cluster Evolution".
Introduction - Data Reduction#
In this activity, we will demonstrate how code can be used to query data from Gaia Data Release 3 to generate a color-magnitude diagram. When extracting data from any database, however, it is important to evaluate the quality of the data you receive from your source. First, the data may contain values that do not make sense, such as a magnitude of 40 for a star. A star that faint is far beyond what our telescopes can detect, so the value cannot be real and would skew the data. Second, sometimes a measurement can't be made and a placeholder value such as -999 is put in its place. This would again skew the data, giving results that do not make sense.
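The two problems described above can be screened for with boolean masks. The following is a minimal sketch on a small made-up magnitude array; the `-999` sentinel comes from the text, while the plausible range of 0 to 25 mag is an illustrative assumption, not an official Gaia limit.

```python
import numpy as np

# Made-up G magnitudes containing the kinds of problems described in the text
g_mag = np.array([12.3, -999.0, 40.0, 18.7, np.nan, 9.1])

SENTINEL = -999.0    # placeholder value used when no measurement exists
MIN_PLAUSIBLE = 0.0  # assumed bright limit, for illustration only
MAX_PLAUSIBLE = 25.0 # assumed faint limit, for illustration only

is_finite = np.isfinite(g_mag)                 # rejects NaN and inf
is_sentinel = g_mag == SENTINEL                # flags the placeholder value
is_plausible = (g_mag > MIN_PLAUSIBLE) & (g_mag < MAX_PLAUSIBLE)

# Keep only rows that pass every check
good = is_finite & ~is_sentinel & is_plausible
clean = g_mag[good]

print(f"kept {good.sum()} of {g_mag.size} values:", clean)
```

The same mask can index an Astropy table (`table[good]`), so every column of a bad row is dropped together.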
We are going to look for stars at a specific coordinate and do a radial search around that coordinate. This poses another problem: the search will return stars both inside and outside the cluster. This makes it difficult to generate the HR diagram, because the diagram is only meaningful for stars in the same cluster, which are all at roughly the same distance from us. To identify stars in the cluster, we will target stars with the same proper motions. Using this fact, we can filter out the field stars and keep the cluster stars. Once the field stars are filtered out, we can generate an HR diagram of the cluster and begin analyzing it. This activity will take you through these steps and use an animation to validate that the cluster stars do move together with the same proper motion, as opposed to the field stars, which vary in velocity.
Checkpoints#
Do you feel comfortable answering the following questions? Check your intuition by opening the dropdowns.
Checkpoint 1 - What would a -999 value mean in our photometry data?
A -999 would be indicative that there is no data, or bad data, for a particular star. The catalog uses this flag to make it easy to identify and filter these sources out if desired.
Checkpoint 2 - What is the main challenge when extracting data for a star cluster?
A spatial query of stars in a region of sky will return both cluster stars and foreground/background stars that are along a similar line of sight but are not related to the cluster.
Checkpoint 3 - How will we solve the challenge from Checkpoint 2?
Gaia also measures proper motion, the (slight) movement of the stars in the plane of the sky over time. The cluster stars will all have similar proper motions because they are moving as one, while the foreground/background stars will have random motions with respect to one another.
We can thus select the cluster stars as those sharing very similar proper motions.
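One way to see this idea in numbers is to look for an overdensity in proper-motion space. The sketch below uses entirely synthetic data (a broad "field" population plus a tight "cluster" clump whose center of (-35, -13) mas/yr is an arbitrary illustrative choice) and finds the most populated bin of a 2D histogram. It is an illustration of the principle, not the method this activity prescribes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic proper motions in mas/yr: columns are (pmra, pmdec)
field = rng.normal(loc=0.0, scale=10.0, size=(2000, 2))           # broad scatter
cluster = rng.normal(loc=[-35.0, -13.0], scale=0.5, size=(200, 2))  # tight clump
pm = np.vstack([field, cluster])

# Count stars in 2 mas/yr x 2 mas/yr bins across proper-motion space
counts, pmra_edges, pmdec_edges = np.histogram2d(
    pm[:, 0], pm[:, 1], bins=60, range=[[-60, 60], [-60, 60]]
)

# The densest bin marks the common motion shared by the clump
i_ra, i_dec = np.unravel_index(np.argmax(counts), counts.shape)
peak_pmra = 0.5 * (pmra_edges[i_ra] + pmra_edges[i_ra + 1])
peak_pmdec = 0.5 * (pmdec_edges[i_dec] + pmdec_edges[i_dec + 1])

print(f"densest bin centered near pmra={peak_pmra:.1f}, pmdec={peak_pmdec:.1f} mas/yr")
```

With real data, the same histogram can suggest where to center the proper-motion ranges for a given cluster.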
Set up your Machine#
It is important to run the following installs for the necessary libraries. Whether you are using a hosted environment such as Google Colab or a local Jupyter notebook, you must install and set up the following. Copy and paste the code into a notebook cell and run it.
Specifically for Google Colab: When running the cell below, notice that the infinity symbol at the top of the Colab turns gray, and turns back to yellow when it finishes. While it is running, click "Show code" to see what it is doing. The "install" statements are installing software on your virtual machine, and the "import" statements are importing functionality that you will need to run the code below. If you leave the Colab, Google might reallocate the virtual machine you are using. In this case, when you return, you would get a new virtual machine, which does not have the software installed, so you will need to run the cell again. If something breaks, the infinity symbol at the top of the Colab will turn red and an error message will be displayed below the corresponding cell.
#@title Installs and auxiliary functions
%pip install numpy
%pip install astroquery
%pip install astropy
%pip install pandas
%pip install ipywidgets
%pip install matplotlib
from astroquery.gaia import Gaia
from astropy.table import Table
import astropy.units as u
from matplotlib import pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation
from IPython.display import HTML, display
import pandas as pd
import ipywidgets as widgets
from matplotlib import rc
from astroquery.vizier import Vizier
def interactive_display_astropy_table(astropy_table):
    """
    Interactive illustration of an Astropy table in a Jupyter Notebook using Pandas.
    Parameters:
        astropy_table (astropy.table.Table): The Astropy table to be displayed.
    """
    # Convert Astropy Table to Pandas DataFrame
    df = astropy_table.to_pandas()
    # Create widgets for interactive display
    search_widget = widgets.Text(
        description='Search:',
        placeholder='Type here...',
        style={'description_width': 'initial'}
    )
    dropdown_widget = widgets.Dropdown(
        options=['All'] + list(df.columns),
        description='Filter by:',
        style={'description_width': 'initial'}
    )
    output = widgets.Output()
    def update_table(change):
        with output:
            output.clear_output()
            if search_widget.value:
                if dropdown_widget.value == 'All':
                    mask = df.apply(lambda row: row.astype(str).str.contains(search_widget.value, case=False).any(), axis=1)
                else:
                    mask = df[dropdown_widget.value].astype(str).str.contains(search_widget.value, case=False)
                display(df[mask])
            else:
                display(df)
    search_widget.observe(update_table, names='value')
    dropdown_widget.observe(update_table, names='value')
    # Display the widgets and the table
    display(widgets.VBox([widgets.HBox([search_widget, dropdown_widget]), output]))
    update_table(None)
Query Gaia#
In the following exercise, you’ll query the Gaia database to retrieve the information we need throughout this activity.
Exercise 1
Use the astroquery package to query the Gaia DR3 database and retrieve all stars within a 1 degree radius of “Messier 44” (the Beehive Cluster).
Hint: The astropy.coordinates.SkyCoord class has a convenient class method to retrieve coordinates by name: SkyCoord.from_name("name").
Visualize the table first using the built-in view (which by default saves screen space by showing only a few of the many rows), then pass it to the provided interactive_display_astropy_table function to get an interactive view.
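A cone search like the one in Exercise 1 can be written in several ways. The sketch below shows one possible approach: building an ADQL query string against the `gaiadr3.gaia_source` table and submitting it with `Gaia.launch_job_async`. The helper names `build_cone_query` and `run_query` are hypothetical, the column list is one reasonable selection, and the M44 coordinates are approximate values entered by hand rather than resolved by name.

```python
def build_cone_query(ra_deg, dec_deg, radius_deg,
                     columns=("source_id", "ra", "dec", "parallax",
                              "pmra", "pmdec", "phot_g_mean_mag",
                              "phot_bp_mean_mag", "phot_rp_mean_mag")):
    """Return an ADQL cone-search query string for Gaia DR3 (hypothetical helper)."""
    column_list = ", ".join(columns)
    return (
        f"SELECT {column_list} "
        "FROM gaiadr3.gaia_source "
        "WHERE 1 = CONTAINS("
        "POINT('ICRS', ra, dec), "
        f"CIRCLE('ICRS', {ra_deg}, {dec_deg}, {radius_deg}))"
    )


def run_query(query):
    """Submit the query to the Gaia archive; needs astroquery and a network connection."""
    from astroquery.gaia import Gaia  # imported here so the builder works offline
    job = Gaia.launch_job_async(query)
    return job.get_results()  # an astropy Table


# Approximate position of M44 in degrees (assumed values for illustration)
query = build_cone_query(130.1, 19.67, 1.0)
print(query)
```

Calling `run_query(query)` in the notebook would then return the table; the string is built separately so it can be inspected before anything is sent to the archive.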
Exercise 2
Answer the following questions in a sentence or two each.
How does the astronomical magnitude scale work? Are the stars in the list above faint or bright?
What are the units for each column?
Stars with high proper motion often have large parallax. Why might this be? Is that true for the data set above?
Look closely at the numbers. Do you notice anything strange, missing, duplicated, or unreasonable?
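As a numerical aid for thinking about these questions, here is a small sketch of two standard relations: a magnitude difference corresponds to a flux ratio of 10^(-0.4 Δm), and a parallax in milliarcseconds corresponds to a distance of 1000/parallax parsecs. The function names are hypothetical, and the distance formula is only sensible for positive, well-measured parallaxes.

```python
import math

def flux_ratio(mag_a, mag_b):
    """Return how many times brighter star A appears than star B."""
    return 10 ** (-0.4 * (mag_a - mag_b))

def distance_pc(parallax_mas):
    """Convert a parallax in milliarcseconds to a distance in parsecs."""
    if parallax_mas <= 0:
        raise ValueError("distance is undefined for non-positive parallax")
    return 1000.0 / parallax_mas

# A magnitude 1 star compared with a magnitude 6 star (5 magnitudes apart)
ratio = flux_ratio(1.0, 6.0)

# A star with a parallax of 10 mas
distance = distance_pc(10.0)

print(f"flux ratio: {ratio:.1f}, distance: {distance:.1f} pc")
```

Smaller magnitudes mean brighter stars, which is why the ratio above is greater than one.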
Exercise 3
Now adjust your code to query Gaia DR3 data for one of the following clusters: Pleiades (M45), Owl (NGC 457), or Jewel Box (NGC 4755). You will work with this cluster for the remainder of the exercises.
Photometry & CMD#
We will now plot the color-magnitude diagram for the photometry data you extracted from Gaia for your star cluster. You will notice the standard clustering of points that traces the HR diagram of a star cluster. However, you will also notice a fair number of stars in the diagram that lie far from this relation. These are known as field stars: stars in the same region of the sky as the cluster that are not actually part of it.
Exercise 4
Plot a color-magnitude diagram (CMD) of the table you retrieved from Gaia. The y-axis should be the "phot_g_mean_mag" column (that’s the magnitude part), and the x-axis should be the "phot_bp_mean_mag" column minus the "phot_rp_mean_mag" column (i.e., $G_{BP} - G_{RP}$; that’s the color part).
Note: A CMD and HR diagram provide similar information, though there are subtle differences. A “color” (how red or blue the stars are on average) is a good proxy for temperature, while the magnitude is a good proxy for luminosity (the two axes of an HR diagram). Technically, only the absolute (distance independent) magnitude is a direct proxy for luminosity, but because all the stars within the star cluster are at the same distance from us, their relative magnitude differences will still create the standard main sequence/giant branch/horizontal branch kind of shape.
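The two CMD axes described above can be sketched as follows. The helper names `cmd_coordinates` and `plot_cmd` are hypothetical, and the three-star input is made up; with a real table you would pass the three Gaia columns instead. The y-axis is inverted because smaller magnitudes are brighter.

```python
import numpy as np

def cmd_coordinates(g_mag, bp_mag, rp_mag):
    """Return (color, magnitude) arrays, dropping rows with missing photometry."""
    g = np.asarray(g_mag, dtype=float)
    color = np.asarray(bp_mag, dtype=float) - np.asarray(rp_mag, dtype=float)
    keep = np.isfinite(color) & np.isfinite(g)
    return color[keep], g[keep]

def plot_cmd(color, g_mag):
    """Scatter-plot a CMD with bright stars at the top (requires matplotlib)."""
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots(figsize=(6, 8))
    ax.scatter(color, g_mag, s=2, color="black")
    ax.invert_yaxis()  # smaller magnitude = brighter, so flip the axis
    ax.set_xlabel("BP - RP (mag)")
    ax.set_ylabel("G (mag)")
    return fig, ax

# Made-up photometry; the second star has no BP measurement
color, g = cmd_coordinates(
    [10.0, 12.0, 14.0],
    [10.5, np.nan, 15.0],
    [9.5, 11.0, 13.5],
)
print(color, g)
```

Calling `plot_cmd(color, g)` in a notebook draws the figure. If the table columns are masked, filling the masked entries with NaN first (for example with the column's `filled(np.nan)` method) lets the finite-value check remove them.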
Questions#
Exercise 5
Identify the cluster and field stars. Do you have more field stars than cluster stars, more cluster stars than field stars, or is it hard to tell from your plot?
Where are most of the stars located on the plot? Do they have high luminosities or lower luminosities? Are they more red, blue or white?
Exercise 6
Bonus Question
Scientists were able to identify star cluster membership at least reasonably well before these proper motion data were available. How would you separate out field stars from cluster stars given only the data you’ve plotted so far?
Field Star Removal with Astrometry Plots#
Next we need to remove the field stars from our HR plot. We identify cluster stars through their proper motions. The proper motions in the RA and Dec directions will be nearly the same for all stars that are part of the cluster, as those stars are all moving (to first order) in the direction and at the velocity that the cluster as a whole is moving.
Exercise 7
The proper motion information for your stars is stored in the "pmra" and "pmdec" columns. We will also make use of the "parallax" column, which gives an indication of distance.
Using a pmra range of [-39,-33], a pmdec range of [-16,-10], and a parallax range of [4,7], filter your table to only contain stars within those ranges.
Additionally, create a version of your table filtered for all stars outside of the ranges above in all three parameters. Hint: the ~ operator in Python will let you invert a filtering operation when applied to a boolean mask, i.e., it means “not X”.
Plot a CMD for both your “cluster” filtered table and your “not cluster” filtered table. Describe each plot and what they show in terms of field stars and cluster stars.
Give at least two pieces of evidence from the plots to suggest that the cluster stars were well-determined, and explain why you chose those pieces of evidence.
Are there any outliers in your cluster filtered plot? What kind of stars could these be?
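The range-based selection in this exercise can be sketched with boolean masks. The example below applies the ranges quoted in the exercise to six made-up stars; the helper `in_range` is hypothetical. Note the difference between "outside the range in all three parameters" and simply "not a cluster member": the two masks do not select the same stars.

```python
import numpy as np

# Six made-up stars (mas/yr for proper motions, mas for parallax)
pmra = np.array([-36.1, -35.5, 2.0, -34.0, 10.5, -36.8])
pmdec = np.array([-12.9, -13.4, -5.0, -30.0, 1.2, -11.0])
parallax = np.array([5.4, 5.3, 1.1, 5.5, 0.4, 9.0])

def in_range(values, low, high):
    """Boolean mask that is True where low <= value <= high."""
    return (values >= low) & (values <= high)

in_pmra = in_range(pmra, -39, -33)
in_pmdec = in_range(pmdec, -16, -10)
in_plx = in_range(parallax, 4, 7)

# Cluster candidates: inside the range in every parameter
cluster_mask = in_pmra & in_pmdec & in_plx

# Field stars as defined in the exercise: outside the range in all three
field_mask = ~in_pmra & ~in_pmdec & ~in_plx

# A looser alternative: everything that is not a cluster candidate
not_cluster_mask = ~cluster_mask

print(cluster_mask.sum(), field_mask.sum(), not_cluster_mask.sum())
```

With an Astropy table, `table[cluster_mask]` and `table[field_mask]` would give the two filtered tables.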
Visualization of Cluster Motion#
The code below shows how cluster members move within a star field. We used the data above to identify star cluster members through their proper motion. The animation shows the proper motions of all the stars, cluster members and field stars alike.
Questions#
Identify the cluster members in the animation - how can you distinguish between cluster members and field stars?
Describe the cluster member motion.
Describe your reactions to this animation. Did you expect this result or did the result surprise you and why?
def plot_cluster_dynamic(_table, _cluster_table, _field_table, _mf, _speed):
    """
    Plot the cluster dynamically based on the proper motion.
    Parameters:
        _table (astropy.table.Table): The table containing all of the stars; used to set the plot limits.
        _cluster_table (astropy.table.Table): The stars selected as cluster members (drawn in red).
        _field_table (astropy.table.Table): The stars selected as field stars (drawn in grey).
        _mf (float): Margin factor; the fraction of the RA/Dec range added as padding on each side.
        _speed (float): The speed factor for the animation, in years of motion per frame.
    """
    # Positions of all stars, used only to set the plot limits
    ra = _table["ra"]
    dec = _table["dec"]
    # Range of RA/Dec, padded by the margin factor
    _ra_range = max(ra) - min(ra)
    _dec_range = max(dec) - min(dec)
    _min_plot_ra = min(ra) - _mf * _ra_range
    _max_plot_ra = max(ra) + _mf * _ra_range
    _min_plot_dec = min(dec) - _mf * _dec_range
    _max_plot_dec = max(dec) + _mf * _dec_range
    # Set up the plot
    fig, ax = plt.subplots(figsize=(10, 10))
    ax.set_xlim(_min_plot_ra, _max_plot_ra)
    ax.set_ylim(_min_plot_dec, _max_plot_dec)
    ax.set_xlabel("RA (degrees)")
    ax.set_ylabel("Dec (degrees)")
    ax.set_title("Cluster with Proper Motion Animation")
    # Create the scatter plot; work on copies so the input tables are not modified in place
    cluster_ra = _cluster_table["ra"].copy()
    cluster_dec = _cluster_table["dec"].copy()
    ax.scatter(cluster_ra, cluster_dec, s=3, color="red")
    field_ra = _field_table["ra"].copy()
    field_dec = _field_table["dec"].copy()
    ax.scatter(field_ra, field_dec, s=1, color="grey")
    # Per-frame displacement of every star.
    # Note: Gaia's pmra already includes a cos(dec) factor. Dividing by cos(dec) is
    # skipped here, which rescales the RA motion but not which stars move together.
    _pm_factor = 0.001 / 3600  # milli-arcsec per year to degrees per year
    _delta_ra_cluster = _cluster_table["pmra"] * _pm_factor * _speed
    _delta_dec_cluster = _cluster_table["pmdec"] * _pm_factor * _speed
    _delta_ra_field = _field_table["pmra"] * _pm_factor * _speed
    _delta_dec_field = _field_table["pmdec"] * _pm_factor * _speed
    # Update function for the animation: redraw, then advance the positions by one step
    def update(_i):
        nonlocal cluster_ra, cluster_dec, field_ra, field_dec
        ax.clear()
        ax.set_xlim(_min_plot_ra, _max_plot_ra)
        ax.set_ylim(_min_plot_dec, _max_plot_dec)
        ax.set_xlabel("RA (degrees)")
        ax.set_ylabel("Dec (degrees)")
        ax.set_title("Cluster with Proper Motion Animation")
        ax.scatter(cluster_ra, cluster_dec, s=3, color="red")
        ax.scatter(field_ra, field_dec, s=1, color="grey")
        cluster_ra += _delta_ra_cluster
        cluster_dec += _delta_dec_cluster
        field_ra += _delta_ra_field
        field_dec += _delta_dec_field
    # Animate the plot
    anim = FuncAnimation(fig, update, frames=100, interval=100, repeat=True)
    # save as gif
    # anim.save("cluster.gif", fps=10)
    plt.close()
    return anim
# Example usage
anim = plot_cluster_dynamic(table, cluster_table, field_table, 0.5, 2000)
# equivalent to rcParams['animation.html'] = 'html5'
rc('animation', html='html5')
anim
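To get a feel for the numbers behind the animation, the sketch below converts a proper motion in mas/yr into a shift in degrees over a given time span. The function name `sky_shift_deg` is hypothetical and the input values are illustrative. Gaia's `pmra` column is defined as the RA proper motion multiplied by cos(dec), so this sketch divides by cos(dec) to recover the change in the RA coordinate itself; the animation function above uses the simpler 0.001/3600 scaling without that division.

```python
import math

MAS_PER_DEGREE = 3600 * 1000  # 3600 arcsec per degree, 1000 mas per arcsec

def sky_shift_deg(pmra_mas_yr, pmdec_mas_yr, dec_deg, years):
    """Return (delta_ra, delta_dec) in degrees after `years` of proper motion."""
    delta_dec = pmdec_mas_yr * years / MAS_PER_DEGREE
    # pmra includes cos(dec); divide it out to get the change in the RA coordinate
    delta_ra = pmra_mas_yr * years / MAS_PER_DEGREE / math.cos(math.radians(dec_deg))
    return delta_ra, delta_dec

# Illustrative star on the celestial equator, followed for 2000 years
d_ra_equator, d_dec_equator = sky_shift_deg(-36.0, -13.0, 0.0, 2000.0)

# The same proper motion at a declination of 60 degrees
d_ra_60, d_dec_60 = sky_shift_deg(-36.0, -13.0, 60.0, 2000.0)

print(d_ra_equator, d_dec_equator, d_ra_60, d_dec_60)
```

Even over thousands of years the shifts are small fractions of a degree, which is why the animation needs a large speed factor to make the motion visible.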
Calculate T_eff and Luminosity for cluster stars#
The code below will calculate the temperature and luminosity of the cluster members to identify their characteristics. They will be plotted as well. The steps of the calculation are described in the code comments.
Exercise 8
Read through the code. Write a paragraph describing the calculations as if you were explaining it to a friend. Describe how these calculations are done.
What are the magnitudes in Gaia measured in, and what do we need to convert them to?
The T_eff and Luminosity graph uses logarithmic axes to plot the information. How would you read this graph? What do the data say about the cluster members?
Exercise 9
Implement the code to ultimately add a Teff and a Luminosity column to your cluster table. Plot these quantities in order to create a true HR diagram. (Don’t forget that temperature increases to the left in an HR diagram, so you may need to reverse your axis.)
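As a starting point for the luminosity half of this exercise, the sketch below shows the standard distance-modulus and magnitude-to-luminosity relations. It is not the activity's own calculation: the function names are hypothetical, the bolometric correction defaults to zero as a simplifying assumption, extinction is ignored, and the solar bolometric magnitude of 4.74 is the IAU 2015 nominal value. Converting color to temperature needs a calibrated relation that is not shown here.

```python
import math

M_BOL_SUN = 4.74  # IAU 2015 nominal absolute bolometric magnitude of the Sun

def absolute_magnitude(apparent_mag, parallax_mas):
    """Apply the distance modulus using a parallax in milliarcseconds."""
    distance_pc = 1000.0 / parallax_mas
    return apparent_mag - 5.0 * math.log10(distance_pc / 10.0)

def luminosity_solar(abs_mag, bolometric_correction=0.0):
    """Luminosity in solar units; the zero default correction is an assumption."""
    m_bol = abs_mag + bolometric_correction
    return 10 ** (-0.4 * (m_bol - M_BOL_SUN))

# A star at 10 pc (parallax 100 mas) has equal apparent and absolute magnitudes
abs_mag_near = absolute_magnitude(8.0, 100.0)

# A star at 100 pc (parallax 10 mas) appears 5 magnitudes fainter than at 10 pc
abs_mag_far = absolute_magnitude(10.0, 10.0)

# A star with the Sun's bolometric magnitude has one solar luminosity
lum_sun_like = luminosity_solar(M_BOL_SUN)

print(abs_mag_near, abs_mag_far, lum_sun_like)
```

Applied to arrays with NumPy (`np.log10` in place of `math.log10`), the same expressions can fill a whole table column at once.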