
February 17, 2024

Acquiring and Analyzing Earnings Announcements Data in Python


Earnings announcements provide valuable insights into a company’s financial health and future prospects, swaying market sentiment and shaping investment strategies. However, two challenges remain: the underlying data is rarely available in an easily retrievable form, and the announcements must be analyzed effectively to extract actionable insights.

In this article, we utilize open-source tools to access earnings information without incurring expenses. Furthermore, our objective is clear: to transform the earnings data into strategically valuable insights through comprehensive analyses, including volatility assessment, trend forecasting, and earnings surprise impact evaluation.

We will navigate through a series of Python-powered methodologies, starting from the initial step of fetching earnings data using Selenium, to advanced techniques involving data processing, visualization, and predictive analytics. We aim to offer readers a comprehensive off-the-shelf framework for technical earnings analysis.

This guide is structured as follows:

       1. Fetching Earnings Data
       2. Stock Prices and Earnings Surprise
       3. Price, Volatility and Volume Around Earnings
       4. Advanced Analysis — Historical and Future Movement Probabilities

1. Fetching Earnings Data

We use Selenium for web scraping, extracting earnings data directly from Yahoo Finance. This dynamic and cost-effective approach stands as an alternative to traditional financial data services.

1.1 Scraping Earnings Announcement Data

The Python code below uses Selenium to set up a headless Chrome browser, navigates to the Yahoo Finance earnings calendar page for a specified stock ticker, and systematically retrieves the earnings information. 

The data, which includes the symbol, company name, earnings date, EPS estimate, reported EPS, and earnings surprise percentage, is then parsed into a structured format.

1.2 Data Processing and Cleaning

Currently, the cleaning process involves extracting and standardizing time and timezone information. We also convert dates to a consistent format suitable for time-series analysis.

Furthermore, this segment of the code cleans numerical data for further analysis. See the complete code for data fetching and cleansing below.

				
					from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
import pandas as pd

def fetch_earnings_data(ticker):
    # Set up Selenium to run headlessly
    options = Options()
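    # Pass the headless flag as an argument; the options.headless attribute
    # is deprecated in recent Selenium 4 releases
    options.add_argument("--headless")
    options.add_argument("--disable-gpu")
    options.add_argument("--window-size=1920,1080")  # Chrome expects width,height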

    driver = webdriver.Chrome(options=options)
    url = f"https://finance.yahoo.com/calendar/earnings?symbol={ticker}"
    driver.get(url)

    # Find the rows of the earnings table
    rows = driver.find_elements(By.CSS_SELECTOR, 'table tbody tr')

    data = []

    for row in rows:
        cols = row.find_elements(By.TAG_NAME, 'td')
        cols = [elem.text for elem in cols]
        data.append(cols)

    # Close the WebDriver
    driver.quit()

    # Assuming the data structure is as expected, create a DataFrame
    columns = ['Symbol', 'Company', 'Earnings Date', 'EPS Estimate', 'Reported EPS', 'Surprise(%)']
    df = pd.DataFrame(data, columns=columns)

    return df

# Example usage:
ticker = "SAP"
earnings_data = fetch_earnings_data(ticker)

# Extract the time and timezone information into a new column
# (matches both daylight and standard time, e.g. "4 PMEDT" or "4 PMEST")
earnings_data['Earnings Time'] = earnings_data['Earnings Date'].str.extract(r'(\d{1,2} [AP]M\s?E[DS]T)', expand=False)

# Extract just the date part from the "Earnings Date" column
earnings_data['Earnings Date'] = earnings_data['Earnings Date'].str.extract(r'(\b\w+ \d{1,2}, \d{4})')

# Convert string date to datetime
earnings_data['Earnings Date'] = pd.to_datetime(earnings_data['Earnings Date'], format='%b %d, %Y')

# Convert datetime to desired string format
earnings_data['Earnings Date'] = earnings_data['Earnings Date'].dt.strftime('%Y-%m-%d')

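# Strip the leading '+' literally (regex=False) and coerce placeholders such as '-' to NaN
earnings_data['Surprise(%)'] = pd.to_numeric(earnings_data['Surprise(%)'].str.replace('+', '', regex=False), errors='coerce')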

earnings_data
				
			

Figure. 1: A detailed snapshot displaying Historical Earnings Data extracted from Yahoo Finance, showcasing earnings dates, EPS estimates, reported EPS, surprise percentages, and corresponding earnings times.
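The cleaning steps can also be exercised without launching a browser. The sketch below wraps them in a hypothetical helper, `clean_earnings_table`, and runs it on made-up rows. The sample strings assume the date format shown above; the text Yahoo Finance actually renders may differ, and the values are illustrative rather than real earnings figures.

```python
import pandas as pd

def clean_earnings_table(raw: pd.DataFrame) -> pd.DataFrame:
    """Clean a scraped earnings table (hypothetical helper for illustration)."""
    df = raw.copy()
    # Time plus timezone suffix, covering daylight (EDT) and standard (EST) time
    df['Earnings Time'] = df['Earnings Date'].str.extract(
        r'(\d{1,2} [AP]M\s?E[DS]T)', expand=False)
    # Date part such as "Oct 18, 2023", parsed into a datetime
    date_part = df['Earnings Date'].str.extract(
        r'(\b\w+ \d{1,2}, \d{4})', expand=False)
    df['Earnings Date'] = pd.to_datetime(date_part, format='%b %d, %Y')
    # Numeric columns: drop a literal '+' and turn placeholders like '-' into NaN
    for col in ['EPS Estimate', 'Reported EPS', 'Surprise(%)']:
        df[col] = pd.to_numeric(
            df[col].str.replace('+', '', regex=False), errors='coerce')
    return df

# Made-up sample rows (assumed format, not real data)
sample = pd.DataFrame({
    'Symbol': ['XYZ', 'XYZ', 'XYZ'],
    'Company': ['Example Corp'] * 3,
    'Earnings Date': ['Jan 24, 2024, 1 AMEST',
                      'Oct 18, 2023, 4 PMEDT',
                      'Apr 22, 2024, 4 PMEDT'],
    'EPS Estimate': ['1.62', '1.41', '-'],
    'Reported EPS': ['1.41', '1.45', '-'],
    'Surprise(%)': ['-12.96', '+2.84', '-'],
})

cleaned = clean_earnings_table(sample)
print(cleaned[['Earnings Date', 'Earnings Time', 'Surprise(%)']])
```

Rows for announcements that have not happened yet keep their date but carry NaN in the numeric columns, so later steps can drop or skip them explicitly.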

2. Stock Prices and EPS

2.1 EPS Markers on Stock Price

By utilizing Python’s yfinance library, we fetch historical stock price data. Contrasting this data with previously retrieved EPS figures unveils the market’s response to earnings announcements.

The following Python snippet fetches historical stock prices within the time frame of available earnings data. It then overlays significant earnings surprises, both positive and negative, on a time series plot of the stock price.

				
					import yfinance as yf
import pandas as pd
import matplotlib.pyplot as plt

#earnings_data['Surprise(%)'] = earnings_data['Surprise(%)'].str.replace('+', '').astype(float)
#earnings_data['Earnings Date'] = pd.to_datetime(earnings_data['Earnings Date'])

# Fetch stock price data
ticker = 'SAP'
stock_data = yf.download(ticker, start=earnings_data['Earnings Date'].min(), end=earnings_data['Earnings Date'].max())

# Plotting stock data
plt.figure(figsize=(25, 7))
stock_data['Close'].plot(label='Stock Price', color='blue')

# Plotting earnings surprise
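for index, row in earnings_data.iterrows():
    # Skip rows without a usable date or surprise value (e.g. upcoming announcements)
    if pd.isna(row['Earnings Date']) or pd.isna(row['Surprise(%)']):
        continue
    date = pd.Timestamp(row['Earnings Date'])
    # If exact date is not available, use the closest available date
    # (Index.get_loc no longer accepts method='nearest' in pandas 2.x)
    if date not in stock_data.index:
        date = stock_data.index[stock_data.index.get_indexer([date], method='nearest')[0]]

    if row['Surprise(%)'] > 0:
        color = 'green'
        marker = '^'
    else:
        color = 'red'
        marker = 'v'

    plt.plot(date, stock_data.loc[date, 'Close'], marker, color=color, markersize=15)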

plt.title(f'{ticker} Stock Price with Earnings Surprise', fontsize = 13 )
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
				
			
Figure. 2: SAP Stock Price with Positive (Green) and Negative (Red) Earnings Surprises Marked on the Announcement Dates.

2.2 The Price Effect of Earnings Announcements

2.2.1 Calculating Price Effect

A crucial aspect of earnings analysis is assessing the price effect, defined here as the percentage change from the last closing price before the earnings announcement to the first closing price after it.

The following code enriches the earnings_data DataFrame with new columns that capture the stock’s price before and after the earnings announcement, as well as the percentage change, which is the price effect.

				
					import yfinance as yf
import pandas as pd

# Assuming 'earnings_data' is the DataFrame and has an 'Earnings Date' column in string format
earnings_data['Earnings Date'] = pd.to_datetime(earnings_data['Earnings Date'])

# Now add buffer days to the start and end dates
buffer_days = 10
startDate = earnings_data['Earnings Date'].min() - pd.Timedelta(days=buffer_days)
endDate = earnings_data['Earnings Date'].max() + pd.Timedelta(days=buffer_days)

# Fetch SAP SE stock price data with additional buffer days
# (`ticker` is the symbol defined earlier, "SAP")
stock_data = yf.download(ticker, start=startDate, end=endDate)

# Function to compute price effect
def compute_price_effect(earnings_date, stock_data):
    try:
        # For "Price Before", if missing, we use the most recent previous price
        price_before = stock_data.loc[:pd.Timestamp(earnings_date) - pd.Timedelta(days=1), 'Close'].ffill().iloc[-1]
        
        price_on = stock_data.loc[pd.Timestamp(earnings_date), 'Close']
        
        # For "Price After", if missing, we use the next available price
        price_after = stock_data.loc[pd.Timestamp(earnings_date) + pd.Timedelta(days=1):, 'Close'].bfill().iloc[0]
        
        price_effect = ((price_after - price_before) / price_before) * 100
    except (KeyError, IndexError):  # in case the date is missing in the stock_data even after filling
        return None, None, None, None
    return price_before, price_on, price_after, price_effect

# Apply the function
earnings_data['Price Before'], earnings_data['Price On'], earnings_data['Price After'], earnings_data['Price Effect (%)'] = zip(*earnings_data['Earnings Date'].apply(compute_price_effect, stock_data=stock_data))

#earnings_data['Surprise(%)'] = earnings_data['Surprise(%)'].str.replace('+', '').astype(float)

earnings_data
				
			
Figure. 3: Tabulated Earnings Data Showing EPS Estimates, Reported EPS, Surprise Percentage, and the Calculated Price Effect.

2.2.2 Stock Price, Effect Percentage, EPS Surprise

To further our analysis, we visualize the relationship between stock prices around the earnings announcement dates, the price effect percentage, and the EPS surprise. This is achieved through the following Python code which generates a multi-faceted bar and line plot.

				
					import pandas as pd
import matplotlib.pyplot as plt

#df = pd.DataFrame(data)
#df['Earnings Date'] = pd.to_datetime(df['Earnings Date'])

# Sort the dataframe by 'Earnings Date' in ascending order
latest_earnings_data = earnings_data.sort_values(by='Earnings Date').tail(14)

# Setting up the plot
fig, ax1 = plt.subplots(figsize=(30,8))

# Bar positions
positions = range(len(latest_earnings_data ))
width = 0.25
r1 = [pos - width for pos in positions]
r2 = positions
r3 = [pos + width for pos in positions]

# Clustered bar plots for prices
bars1 = ax1.bar(r1, latest_earnings_data ['Price Before'], width=width, label='Price Before', color='blue', edgecolor='grey')
bars2 = ax1.bar(r2, latest_earnings_data ['Price On'], width=width, label='Price On', color='cyan', edgecolor='grey')
bars3 = ax1.bar(r3, latest_earnings_data ['Price After'], width=width, label='Price After', color='lightblue', edgecolor='grey')

# Line plots for Surprise(%) and Price Effect (%)
ax2 = ax1.twinx()
ax2.plot(positions, latest_earnings_data ['Surprise(%)'], color='red', marker='o', label='Surprise(%)')
ax2.plot(positions, latest_earnings_data ['Price Effect (%)'], color='green', marker='o', label='Price Effect (%)')

# Annotations for the Surprise(%) and Price Effect (%)
for i, (date, surprise, effect) in enumerate(zip(latest_earnings_data ['Earnings Date'], latest_earnings_data ['Surprise(%)'], latest_earnings_data ['Price Effect (%)'])):
    ax2.annotate(f"{surprise}%", (i, surprise), textcoords="offset points", xytext=(0,10), ha='center', fontsize=16, color='red', fontweight='bold')
    ax2.annotate(f"{effect:.2f}%", (i, effect), textcoords="offset points", xytext=(0,10), ha='center', fontsize=16, color='green', fontweight='bold')

# Annotations for prices
def annotate_bars(bars, ax):
    for bar in bars:
        yval = bar.get_height()
        ax.text(bar.get_x() + bar.get_width()/2, yval, round(yval, 2), ha='center', va='bottom', fontsize=14, rotation=45)

annotate_bars(bars1, ax1)
annotate_bars(bars2, ax1)
annotate_bars(bars3, ax1)

# Setting x-axis with better spacing
ax1.set_xticks(positions)
ax1.set_xticklabels(latest_earnings_data ['Earnings Date'].dt.strftime('%Y-%m-%d'), rotation=45, ha='right', fontsize=14)

# Setting labels and title
ax1.set_xlabel('Earnings Date', fontweight='bold')
ax1.set_ylabel('Price', fontweight='bold')
ax2.set_ylabel('Percentage (%)', fontweight='bold')
ax1.set_title('Earnings Data with Surprise and Price Effect', fontsize=18)

# Add legends
ax1.legend(loc='upper left')
ax2.legend(loc='upper right')

plt.tight_layout()
plt.show()
				
			

Figure. 4: Combined Bar and Line Chart Illustrating the Stock Price Before, On, and After Earnings Dates Alongside Earnings Surprises and Price Effects.

2.3 The Price Effect and EPS Surprise Relationship

To examine the correlation between earnings surprises and subsequent price effects, we use a scatter plot with a fitted regression line. This method provides quantitative insights into the potential influence of surprising earnings figures on stock prices, informing analysts about the predictive power of earnings surprises.

				
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Drop rows with NaN values in 'Surprise(%)' and 'Price Effect (%)' columns
filtered_earnings_data = earnings_data.dropna(subset=['Surprise(%)', 'Price Effect (%)'])

# Linear regression
slope, intercept = np.polyfit(filtered_earnings_data['Surprise(%)'], filtered_earnings_data['Price Effect (%)'], 1)
x = np.array(filtered_earnings_data['Surprise(%)'])
y_pred = slope * x + intercept

# Compute r-squared
correlation_matrix = np.corrcoef(filtered_earnings_data['Surprise(%)'], filtered_earnings_data['Price Effect (%)'])
correlation_xy = correlation_matrix[0,1]
r_squared = correlation_xy**2

# Scatter plot with regression line
plt.figure(figsize=(30, 8))
plt.scatter(filtered_earnings_data['Surprise(%)'], filtered_earnings_data['Price Effect (%)'], color='blue', marker='o')
plt.plot(x, y_pred, color='red', label=f'y={slope:.3f}x + {intercept:.3f}')  # regression line
plt.title('Earnings Surprise vs. Price Effect', fontsize = 20)
plt.xlabel('Earnings Surprise(%)')
plt.ylabel('Price Effect(%)')
plt.grid(True)
plt.legend(loc="upper right")
plt.annotate(f'R-squared = {r_squared:.3f}', xy=(0.05, 0.95), xycoords='axes fraction', fontsize=15, color='green')
plt.show()
				
			

Figure. 5: Scatter Plot Demonstrating the Relationship Between Earnings Surprise Percentages and the Subsequent Price Effect, with a Fitted Regression Line and R-squared Value.

3. Price, Volatility and Volume Around Earnings

3.1 Price Movement Around Earnings

The period surrounding earnings announcements is typically marked by heightened investor attention, often translating into significant price movements. 

To analyze these fluctuations, we normalize the stock prices retrieved with yfinance to the closing price five days before the earnings date, so that relative changes can be compared across announcements.

The Python script below sets up the necessary parameters and iterates through each earnings date to create a time series of normalized prices. 

These are subsequently plotted to visualize price movements around earnings dates, providing a clear depiction of market behavior during these critical periods. 
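A minimal sketch of this normalization follows, under stated assumptions: the function name `normalized_window`, the symmetric window of five trading days, and the synthetic price series are all hypothetical stand-ins, with the synthetic closes taking the place of `stock_data['Close']`. An earnings date that falls on a non-trading day is mapped to the next available session.

```python
import numpy as np
import pandas as pd

def normalized_window(close: pd.Series, event_date,
                      days_before: int = 5, days_after: int = 5) -> pd.Series:
    """Prices around one event, rebased to the close `days_before` sessions earlier."""
    # Position of the first trading day on or after the event date
    pos = close.index.searchsorted(pd.Timestamp(event_date))
    start, end = pos - days_before, pos + days_after
    if start < 0 or end >= len(close):
        return pd.Series(dtype=float)  # not enough data around this event
    window = close.iloc[start:end + 1]
    rebased = window / window.iloc[0]
    # Index by trading-day offset relative to the event (0 = event day)
    rebased.index = np.arange(-days_before, days_after + 1)
    return rebased

# Synthetic closes on business days (illustrative, not market data)
dates = pd.bdate_range('2023-01-02', periods=60)
close = pd.Series(100 + np.arange(60, dtype=float), index=dates)

# Hypothetical earnings dates; the second is a Saturday and rolls to Monday
events = ['2023-02-01', '2023-02-18']
paths = pd.DataFrame({d: normalized_window(close, d) for d in events})
print(paths.round(4))
```

Each column is one announcement and each row a trading-day offset, so `paths.plot()` would draw one line per earnings date starting from a common value of 1.0.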


One thought on “Acquiring and Analyzing Earnings Announcements Data in Python”

Stratagem Research

can u share the code of this section
4.2 Market Implied Movement Probabilities
Moving beyond historical data, we can also utilize market-implied probabilities to gauge expectations about future stock price movements.

This involves analyzing option prices to extract the implied volatility, which reflects the market’s forecast of a stock’s potential to undergo significant price changes. Implied Volatility can be retrieved from the options chain prices on Yahoo Finance.

By applying this market-implied information in Monte Carlo simulations, we can predict a range of potential price outcomes and their associated probabilities for the period around an earnings announcement.
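One way the approach in this passage could be sketched is shown below. It assumes the implied volatility has already been read from an options chain and is passed in as a plain annualised number, and it assumes geometric Brownian motion with constant volatility as the simulation model; the passage does not specify either choice. The function name and all input values are hypothetical.

```python
import numpy as np

def simulate_terminal_prices(spot, implied_vol, days,
                             n_paths=100_000, rate=0.0, seed=0):
    """Terminal prices under geometric Brownian motion with constant volatility."""
    rng = np.random.default_rng(seed)
    t = days / 252  # horizon in years, assuming 252 trading days per year
    z = rng.standard_normal(n_paths)
    drift = (rate - 0.5 * implied_vol ** 2) * t
    return spot * np.exp(drift + implied_vol * np.sqrt(t) * z)

# Illustrative inputs, not market data
spot, iv, horizon = 150.0, 0.30, 5
prices = simulate_terminal_prices(spot, iv, horizon)
returns = prices / spot - 1

# Probabilities of a move beyond +/-3% over the horizon
prob_up_3 = np.mean(returns > 0.03)
prob_down_3 = np.mean(returns < -0.03)
# Central 90% range of simulated prices
low, high = np.percentile(prices, [5, 95])

print(f"P(move > +3%): {prob_up_3:.3f}")
print(f"P(move < -3%): {prob_down_3:.3f}")
print(f"90% price range: {low:.2f} to {high:.2f}")
```

A single implied volatility ignores the skew across strikes and any jump on the announcement itself, so the resulting probabilities are a rough reading of market expectations rather than a forecast.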
