Back To Top

May 11, 2024

How to Track the Portfolio Allocation of Institutional Investors

Accessing and Analyzing Form 13F Reports: Using Python with FMP API and Selenium for Data Retrieval

Large institutional investors manage large sums of money and often have access to resources and information that individual investors do not. By tracking their portfolio changes, we can gain insights into which stocks are gaining attention or losing favor.

Institutional investors managing assets over $100 million must disclose their portfolio holdings quarterly through filings like the SEC’s Form 13F.

Navigating the maze of regulatory reporting and extracting meaningful insights from the raw data can be a challenging endeavor. This article outlines a methodical approach to tracking these portfolios, using public data sources and analytical tools.

This article is structured as follows:

  • Getting Structured Data via Public Resources or using the FMP API
  • Tracking Portfolio and Asset Holdings Changes Over Time
  • Buy and Sell Activity by Asset and Investment Companies
  • Performance Evaluation over Time

1. Retrieving SEC 13F Fillings Data

1.1 Manual Data Access Steps

Manually extracting data from SEC filings is often tedious and time-consuming. However, this process is crucial to comprehend the structure and content of the reported information. Here are the steps.

Step 1. Search for a Company in SEC Edgar Database

Navigate to the SEC EDGAR website (www.sec.gov/edgar.shtml) and use the search function to find the company or institution of interest. Enter the company name or its Central Index Key (CIK) to locate its filings.

Step 2. Explore the Latest 13F Fillings and Click on ‘Filling’

Once you’ve found the company, look for the latest 13F filing. These filings are typically labeled as ‘Form 13F-HR’ or ‘Form 13F-HR/A’ (for amended filings). Click on the ‘Filing’ link to access the detailed report.

Figure. 1: Comprehensive View of 'BERKSHIRE HATHAWAY' Filings. Source: SEC EDGAR Database.

Step 3. Click on the HTML Version of the Information Table

The 13F filing will typically have multiple document types, including the main report as well as any exhibits. 

Look for the ‘Information Table’ and click on the HTML link to view the portfolio holdings in a more user-friendly format.

Figure. 2: 'BERKSHIRE HATHAWAY' 2023–11–14 13F fillings. Source: SEC EDGAR Database.

Step 4. Voila! Read and Analyse the Portfolio Holdings

The HTML version of the Information Table will present the detailed portfolio holdings, including the securities, shares held, market value, and percentage of the total portfolio. 

You can manually review reviewed and analyze this information to understand the investment strategies and trends of the institutional investor.

Figure. 3: 'BERKSHIRE HATHAWAY' 2023–11–14 13F fillings. Source: SEC EDGAR Database.

1.2 Accessing the Data via the FMP API using Python

The Financial Modeling Prep (FMP) API conveniently accesses data on institutional investors’ portfolio holdings, tracking purchases, sales, and value changes over time.

The API further enhances the data points by including metrics on securities held, shares owned, market values, and changes in portfolio weights over time. This enables you to analyze the investment strategies and trends of the tracked institutional investors.

The code snippet below demonstrates how to use the FMP API to fetch the portfolio holdings data for a set of the top largest institutional investors using their CIKs (Central Index Keys) and quarter-end dates:

				
					import requests
import pandas as pd

# Define the base URL for the API endpoint
url = 'https://financialmodelingprep.com/api/v4/institutional-ownership/portfolio-holdings'

# List of quarter-end dates for 2023
quarter_end_dates = [
    '2022-03-31', '2022-06-30', '2022-09-30', '2022-12-31',
    '2023-03-31', '2023-06-30', '2023-09-30', '2023-12-31',
    '2024-03-31'  
]

# List of CIKs you're interested in
ciks = [
        '0001067983', 
        '0000019617', 
        '0000070858', 
        '0000895421', 
        '0000886982', 
        ]

# Initialize an empty DataFrame to append each quarter's data for all CIKs
all_data_df = pd.DataFrame()

# Your API key (replace with your actual API key)
api_key = ''

for cik in ciks:
    for date in quarter_end_dates:
        # Parameters for the GET request
        params = {
            'date': date,  # Use the current quarter-end date
            'cik': cik,  # Use the current CIK
            'page': 0,  # Assuming you're only interested in the first page of results
            'apikey': api_key
        }
        
        # Make the GET request for the current quarter and CIK
        response = requests.get(url, params=params)
        
        if response.status_code == 200:
            # If the request was successful, append the data to the DataFrame
            data = response.json()
            quarter_cik_df = pd.DataFrame(data)
            all_data_df = pd.concat([all_data_df, quarter_cik_df], ignore_index=True)
        else:
            print(f"Failed to retrieve data for CIK {cik} on {date}: {response.status_code}")
				
			
				
					{
"date": "2023-06-30",
"cik": "0001067983",
"filingDate": "2023-08-14",
"symbol": "AAPL",
"securityName": "APPLE INC",
"typeOfSecurity": "COM",
"securityCusip": "037833100",
"sharesType": "SH",
"putCallShare": "Share",
"investmentDiscretion": "DFND",
"weight": 51.0035,
"lastWeight": 46.4386,
"changeInWeight": 4.5649,
"changeInWeightPercentage": 9.8301,
"sharesNumber": 915560382,
"lastSharesNumber": 915560382,
"changeInSharesNumber": 0,
"changeInSharesNumberPercentage": 0,
"isNew": false,
"firstAdded": "2016-03-31",
"quarterEndPrice": 193.97,
"avgPricePaid": 41.18,
"isSoldOut": false,
"ownership": 5.7994,
"lastOwnership": 5.7994,
"changeInOwnership": 0,
"changeInOwnershipPercentage": 0,
"holdingPeriod": 30,
"isCountedForPerformance": true,
"marketValue": 177591247296,
"lastMarketValue": 150975906991,
"changeInMarketValue": 26615340305,
"changeInMarketValuePercentage": 17.6289,
"performance": 26615340304,
"lastPerformance": 31302912039,
"changeInPerformance": -4687571734,
"performancePercentage": 17.6289,
"investorName": "BERKSHIRE HATHAWAY INC",
"industryTitle": "ELECTRONIC COMPUTERS"
}
				
			

FMP grants access to a wide array of financial data, including but not limited to,  trading activity by institutional Investors. Sign up for the API here.

Sign up to FMP with 15% Discount. Free version of the API also available.

1.3 Scraping the Data Automatically using Selenium

While the FMP API provides a convenient way to access institutional investor portfolio data with enhanced metrics, there may be cases we’d want to retrieve the data ourselves in an open-source way.

In such scenario, you can leverage web scraping techniques to extract the information directly from the SEC’s EDGAR database.

It’s important to note that web scraping can be more fragile than using a dedicated API, as changes in the website structure or layout can potentially disrupt the scraping process. 

Additionally, web scraping may be subject to rate limits or other restrictions imposed by the SEC, and excessive requests may result in timeouts or other issues as the system detects the automated activity. 

Therefore, it’s crucial to review and comply with any relevant terms of service or regulations when using this approach.

Step 1 – Retrieve all the URLs and XML Links to be Scraped

The first step involves setting up the necessary variables, including the Central Index Key (CIK) for the institution (in this case, Berkshire Hathaway), request headers (to mimic a legitimate web browser session), and a flag to include or exclude amendments in the filings.

Block 1: Fetching the Submission Date for Filings

  • We use the requests library to call the SEC’s API, fetching filings for a specific company using its CIK.
  • The response is filtered to extract only 13F forms, which are quarterly reports filed by institutional investment managers.
  • Additionally, the script captures the filing date and the URL for each 13F filing, storing this information for later use.

Block 2: Fetching the XML URLs from the Filings

  • A headless browser session is initiated using Selenium’s WebDriver. This allows us to navigate to the report URLs of the 13F filings without the need for a graphical user interface.
  • For each filing, the script waits for the page to load and then locates the XML URL that contains the detailed holdings information.
  • Each identified XML URL is added to a list for further processing.

2. Analysing Portfolio Over Time

Now that we have successfully retrieved the institutional investor portfolio data, we can start analyzing the changes in their holdings over time. This analysis will provide valuable insights into the investment strategies and trends favored by these influential market players.

The first step is to convert the ‘date’ column to a datetime format, which allows us to easily extract the year and quarter information for each data point. This is crucial for the subsequent analysis, as we want to understand how the portfolios have evolved on a quarterly basis.

Next, we select and rename the relevant columns from the original DataFrame. Specifically, we keep the ‘year’, ‘quarter’, ‘investorName’, ‘symbol’ (renamed to ‘ticker’), ‘weight’, and ‘marketValue’ (renamed to ‘value’) columns.

The key step here is to calculate the total market value of each investor’s portfolio in each quarter. We use the groupby() function to aggregate the ‘value’ column by ‘year’, ‘quarter’, and ‘investorName’, storing the total portfolio value in a new column called ‘total_portfolio_value’.

With the total portfolio values calculated, we can now derive the weight of each asset as a percentage of the total portfolio. We create a new column called ‘computed_weight’ that represents this percentage, and then optionally rename it back to ‘weight’ for consistency with the rest of the code.

At this point, the transformed_df contains all the necessary information for analyzing the portfolio changes over time, including the year, quarter, investor name, ticker, weight, and total portfolio value.

This DataFrame can now be used for further analysis and visualization, such as tracking the changes in holdings, identifying significant position changes, and understanding the overall investment strategies of the tracked institutions.

				
					import requests
import pandas as pd

# Define the base URL for the API endpoint
url = 'https://financialmodelingprep.com/api/v4/institutional-ownership/portfolio-holdings'

# List of quarter-end dates for 2023
quarter_end_dates = [
    '2022-03-31', '2022-06-30', '2022-09-30', '2022-12-31',
    '2023-03-31', '2023-06-30', '2023-09-30', '2023-12-31',
    '2024-03-31'  
]

# List of CIKs you're interested in
ciks = [
        '0001067983', 
        '0000019617', 
        '0000070858', 
        '0000895421', 
        '0000886982'
        ]

# Initialize an empty DataFrame to append each quarter's data for all CIKs
all_data_df = pd.DataFrame()

# Your API key (replace with your actual API key)
api_key = ''

for cik in ciks:
    for date in quarter_end_dates:
        # Parameters for the GET request
        params = {
            'date': date,  # Use the current quarter-end date
            'cik': cik,  # Use the current CIK
            'page': 0,  # Assuming you're only interested in the first page of results
            'apikey': api_key
        }
        
        # Make the GET request for the current quarter and CIK
        response = requests.get(url, params=params)
        
        if response.status_code == 200:
            # If the request was successful, append the data to the DataFrame
            data = response.json()
            quarter_cik_df = pd.DataFrame(data)
            all_data_df = pd.concat([all_data_df, quarter_cik_df], ignore_index=True)
        else:
            print(f"Failed to retrieve data for CIK {cik} on {date}: {response.status_code}")

# Process the fetched data
all_data_df['date'] = pd.to_datetime(all_data_df['date'])
all_data_df['year'] = all_data_df['date'].dt.year
all_data_df['quarter'] = all_data_df['date'].dt.quarter
transformed_df = all_data_df[['year', 'quarter', 'investorName', 'symbol', 'weight', 'marketValue']].copy()
transformed_df.rename(columns={'symbol': 'ticker', 'marketValue': 'value'}, inplace=True)
transformed_df['value'] = pd.to_numeric(transformed_df['value'])

# Calculate and merge total portfolio values
total_market_values = transformed_df.groupby(['year', 'quarter', 'investorName'])['value'].sum().reset_index(name='total_portfolio_value')
transformed_df = pd.merge(transformed_df, total_market_values, on=['year', 'quarter', 'investorName'])
transformed_df['computed_weight'] = transformed_df['value'] / transformed_df['total_portfolio_value']
transformed_df.drop('weight', axis=1, inplace=True)
transformed_df.rename(columns={'computed_weight': 'weight'}, inplace=True)

# Filter and prepare final_df for plotting
final_df = transformed_df[transformed_df['weight'] > 0.05].copy()
final_df['period'] = final_df['year'].astype(str) + ' Q' + final_df['quarter'].astype(str)
final_df.sort_values(by=['period', 'investorName', 'ticker'], inplace=True)
total_values = final_df.groupby(['period', 'investorName'])['value'].sum().reset_index(name='totalValue')
final_df = pd.merge(final_df, total_values, on=['period', 'investorName'], how='left')
final_df['totalValueText'] = '$' + (final_df['totalValue'] / 1e9).round(2).astype(str) + 'B'

fig, ax = plt.subplots(figsize=(24, 12))

periods = final_df['period'].unique()
investorNames = final_df['investorName'].unique()
tickers = final_df['ticker'].unique()

# Assign a unique color to each ticker
colors = {ticker: plt.cm.tab20(i/len(tickers)) for i, ticker in enumerate(tickers)}

# Position index for each period and investorName
period_indices = {period: i for i, period in enumerate(periods)}
investor_positions = {investorName: i for i, investorName in enumerate(investorNames)}

bar_width = 0.2
spacing = 0.02

for period in periods:
    period_position = period_indices[period]
    
    for investorName in investorNames:
        investor_df = final_df[(final_df['period'] == period) & (final_df['investorName'] == investorName)]
        if not investor_df.empty:
            bottom = 0
            totalValueText = investor_df['totalValueText'].iloc[0]  # Get the total value text for the investor
            
            for _, row in investor_df.iterrows():
                bar_position = period_position + investor_positions[investorName]*(bar_width + spacing)
                weight = row['weight']
                ax.bar(bar_position, weight, bottom=bottom, width=bar_width, color=colors[row['ticker']], edgecolor='white')
                bottom += weight
                
                # Retaining the original segment labels within the bar
                ax.text(bar_position, bottom - (weight / 2), f"{row['ticker']}\n{row['weight']*100:.1f}%\n${row['value']/1e9:.1f}B", ha='center', va='center', fontsize=9, rotation=90)

            # Adding total portfolio value next to the investor name
            label_with_value = f"{investorName} ({totalValueText})"
            ax.text(bar_position, bottom + 0.03, label_with_value, ha='center', va='bottom', fontsize=12, rotation=90, transform=ax.transData)

ax.set_xticks([i + (bar_width + spacing) * len(investorNames) / 2 - bar_width/2 for i in range(len(periods))])
ax.set_ylim(0, ax.get_ylim()[1] * 1.4)
ax.set_xticklabels(periods)
ax.set_ylabel('Weight (%)')
ax.set_title('Quarterly Portfolio Composition for Investors (Assets Exceeding 5% Allocation)', fontsize = 20)

plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
				
			

Figure. 3: 'BERKSHIRE HATHAWAY' 2023–11–14 13F fillings. Source: SEC EDGAR Database.

3. Buy and Sell Activity by Quarter

Analyzing the buy and sell activity of institutional investors across their portfolio holdings on a quarterly basis. By understanding the changes in portfolio weights, we can gain insights into the investment strategies and trading patterns.

1. Define a custom colormap:

We start by creating a custom colormap using the LinearSegmentedColormap from Matplotlib’s colors module. This colormap will be used to represent the relative changes in portfolio weights, with red indicating a decrease, white indicating little to no change, and green indicating an increase.

2. Identify the unique quarters:

We extract the unique quarters from the all_data_df DataFrame and sort them in reverse chronological order.

3. Iterate through the quarters:

For each quarter, we perform the following steps:

  • Filter the DataFrame to get the data for the current quarter.
  • Prepare the data for the heatmap by grouping the data by ‘investorName’ and ‘symbol’, and calculating the mean of the ‘relative_change_pct’ column.
  • Pivot the grouped data to create a matrix-like structure, with investors as the rows, tickers as the columns, and the mean relative change percentage as the cell values.

4. Create the heatmap visualization:

We use the Seaborn heatmap() function to create a heatmap plot for the current quarter. The heatmap is colored using the custom colormap defined earlier, with the center value set to 0 (no change).

5. Customize the plot:

We adjust the figure size, linewidths, and linecolor to improve the readability and aesthetics of the heatmap. Additionally, we set the x-axis and y-axis labels, the title, and the rotation of the tick labels.

6. Display the plot:

Finally, we display the heatmap using plt.show().

The resulting heatmap provides a clearer visual representation of the buy and sell activity for each investor and their portfolio holdings during the selected quarter. 

The color-coding allows to quickly identify the assets that have experienced increases, decreases, or little to no change in their portfolio weights.

				
					import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.colors import LinearSegmentedColormap

# Define a custom colormap
colors = ["red", "white", "green"]  # Red to White to Green for decrease, no change, and increase
cmap_name = "custom1"
custom_cmap = LinearSegmentedColormap.from_list(cmap_name, colors)

# Unique quarters for iteration
quarters = all_data_df['quarter_year'].unique()
quarters = sorted(quarters, key=lambda x: pd.Period(x), reverse=True)

for quarter in quarters:
    # Filter the DataFrame for the current quarter
    quarter_df = all_data_df[all_data_df['quarter_year'] == quarter]
    
    # Prepare the data for the heatmap
    quarter_grouped = quarter_df.groupby(['investorName', 'symbol'])['relative_change_pct'].mean().reset_index()
    quarter_pivot = quarter_grouped.pivot_table(index='investorName', columns='symbol', values='relative_change_pct', fill_value=0)
    
    # Plotting with adjusted linewidths and figure size
    plt.figure(figsize=(30, 15))  # Increase figure size for better readability
    ax = sns.heatmap(quarter_pivot, cmap=custom_cmap, center=0, annot=False, cbar=True,
                     linewidths=0.05, linecolor='gray')  # Adjust linewidths and linecolor here
    plt.title(f'Relative Portfolio Value Change for {quarter} (Green: Increase, Red: Decrease, White: Little/No Change)', fontsize=16)
    plt.xlabel('Asset (Ticker)', fontsize=14)
    plt.ylabel('Investor Name', fontsize=14)
    plt.xticks(rotation=90, fontsize=12)  # Adjust fontsize for better readability
    plt.yticks(rotation=0, fontsize=12)
    plt.show()
				
			

4. Portfolio Performance 

To compute a time series of returns for each investment company, considering both the change in market value due to price changes and the effect of buying or selling shares, we’ll follow a structured approach. Here’s the outline of steps we’ll take:

Step 1: Calculate Periodic Holding Changes

For each asset within each investment company, calculate the change in the number of shares held between consecutive periods. This will help identify periods of buying or selling activities.

Step 2: Calculate Periodic Market Value Changes

Calculate the market value change for each asset that is not attributable to buying or selling activities. This isolates the effect of market price changes on the portfolio’s value.

Step 3: Calculate Period Returns for Each Asset

Use the data to calculate the return for each asset in each period, factoring in both holding changes and market value changes. The formula for calculating the return of an asset in a period could be simplified to:

formula returns entreprenerdly

5. Practical Applications and Insights

The analysis of institutional investor portfolios can provide valuable insights and have practical applications for various stakeholders in the financial markets.

5.1 Identifying Investment Opportunities

Tracking the investment patterns and trading activity of large institutions can help identify promising investment opportunities. Observing the sectors, industries, or individual securities that are gaining attention from sophisticated investors can signal potential undervalued or emerging prospects.

5.2 Evaluating Market Sentiment

The portfolio changes and buy/sell behavior of institutional investors can serve as a barometer for overall market sentiment. Monitoring the shifts in their allocations can offer insights into the perceived risks and opportunities, informing investment decisions and portfolio management.

5.3 Monitoring Industry Trends

Analyzing the portfolio compositions and weight changes across different sectors and industries can reveal evolving trends and growth potential. This information can be valuable for investors, analysts, and policymakers to make more informed decisions.

Concluding Thoughts

Understanding what institutional investors are buying and selling, can give us a leg up on identifying promising opportunities and avoiding potential pitfalls. 

Of course, we’ll need to stay up to date with the data sources and reporting rules as they evolve. But if we can keep up with the changes, the insights we gain could make a real difference in how we approach our own investments.

Prev Post

Restoring Image Quality With AI using Real-ESRGAN and SwinIR

Next Post

Introduction to Large Action Models – The Next AI Frontier

post-bars
Mail Icon

Newsletter

Get Every Weekly Update & Insights

[mc4wp_form id=]

Leave a Comment