A total of 1,613 companies were clustered using 20 years of balance sheet and income statement data collected from the EDGAR API (developer.edgar-online.com). The K-Modes clustering algorithm was applied to all publicly traded companies in America (excluding delisted companies, companies lacking data, and companies with extreme outliers) based on three financial ratios. Clustering on the three ratios grouped similar companies together so that future returns could be analyzed per cluster. K-Modes was used because it separated the data best with respect to specific thresholds: each ratio was flagged as 1 if it met its threshold and 0 if it did not. The analysis sought to test whether the three financial ratios can quantitatively filter for valuable investments that provide satisfactory returns over time.
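The flag-then-cluster step can be illustrated with a minimal sketch. It assumes three placeholder ratios (`ratio_a`, `ratio_b`, `ratio_c`) with made-up thresholds and made-up company data, since the actual ratios and cut-offs are not listed here. The K-Modes routine is a small pure-Python version written for illustration (Hamming distance plus per-column modes, with hand-picked starting modes); it is not necessarily the implementation used in the analysis.

```python
from collections import Counter

# Hypothetical ratio names and thresholds (assumptions for illustration only).
# "min" means the ratio must be at least the limit, "max" at most the limit.
THRESHOLDS = {
    "ratio_a": ("min", 0.15),
    "ratio_b": ("max", 0.50),
    "ratio_c": ("min", 1.50),
}

def flag_company(ratios):
    """Return a tuple of 0/1 flags, one per ratio, in THRESHOLDS order."""
    flags = []
    for name, (kind, limit) in THRESHOLDS.items():
        value = ratios[name]
        passed = value >= limit if kind == "min" else value <= limit
        flags.append(1 if passed else 0)
    return tuple(flags)

def hamming(a, b):
    """Number of positions at which two flag tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def column_modes(rows):
    """Most frequent value in each column of a list of flag tuples."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))

def k_modes(rows, initial_modes, max_iter=20):
    """Assign each row to its nearest mode, then recompute modes until stable."""
    modes = list(initial_modes)
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(modes)), key=lambda k: hamming(row, modes[k]))
            for row in rows
        ]
        new_modes = []
        for k, old_mode in enumerate(modes):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            # Keep the old mode if a cluster ends up empty.
            new_modes.append(column_modes(members) if members else old_mode)
        if new_modes == modes:
            break
        modes = new_modes
    return labels, modes

# Made-up companies and ratio values, not real data.
companies = {
    "AAA": {"ratio_a": 0.22, "ratio_b": 0.30, "ratio_c": 2.1},
    "BBB": {"ratio_a": 0.18, "ratio_b": 0.45, "ratio_c": 1.2},
    "CCC": {"ratio_a": 0.05, "ratio_b": 0.80, "ratio_c": 0.9},
    "DDD": {"ratio_a": 0.02, "ratio_b": 0.40, "ratio_c": 1.0},
    "EEE": {"ratio_a": 0.30, "ratio_b": 0.10, "ratio_c": 1.8},
    "FFF": {"ratio_a": 0.10, "ratio_b": 0.65, "ratio_c": 1.1},
}

rows = [flag_company(r) for r in companies.values()]
labels, modes = k_modes(rows, initial_modes=[(1, 1, 1), (0, 0, 0)])

for ticker, flags, label in zip(companies, rows, labels):
    print(ticker, flags, "cluster", label)
```

In this toy run, the cluster whose mode is all ones collects the companies that pass most thresholds, which corresponds to the group whose future returns would then be compared against the others.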
Note - All ratios were calculated using data from the year 2000 to simulate an investor analyzing companies for future returns from that starting point. The analysis can be used by any individual or entity looking to filter for valuable investments when allocating capital.
Note - The companies in the value index may not be suitable for stock picking in today's environment, because the analysis takes purchase price into consideration and the companies' economics may have deteriorated. The analyst should run the algorithm again and revalue the companies.