Whale Behavior and Market Dynamics Analysis from a Data Scientist’s Perspective

XXXX ETH transferred from an unknown wallet to Coinbase Institutional. Can we learn anything beyond "unknown wallet"? As a data science engineer, I've started a solo project to analyze the correlation between whale transactions and subsequent market dynamics. I am not an expert in finance or crypto markets, so I'm posting my approach here for constructive suggestions. Once the project is complete, I will make the application publicly available.

My initial analysis focuses solely on the ETH mainnet (with plans to expand to other chains and DEXs if the approach proves worthwhile). I examine how large transfers correlate with market movements, measured as log returns, after a certain time delay.
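To make "market movement after a time delay" concrete, here is a minimal sketch of a lagged log return, ln(P[t+lag] / P[t]). It assumes hourly price samples. The function name and the sample prices are hypothetical, for illustration only, and are not taken from the project.

```python
import math

def forward_log_returns(prices, lag):
    """Return ln(P[t+lag] / P[t]) for every t that has a price `lag` steps later.

    `prices` is a list of prices sampled at a fixed interval (assumed hourly here);
    `lag` is the delay in the same units.
    """
    if lag <= 0:
        raise ValueError("lag must be a positive number of steps")
    return [math.log(prices[t + lag] / prices[t])
            for t in range(len(prices) - lag)]

# Hypothetical hourly ETH/USD prices, for illustration only.
hourly_prices = [2000.0, 2010.0, 1990.0, 2050.0, 2100.0, 2080.0, 2200.0]
returns_5h = forward_log_returns(hourly_prices, lag=5)
```

Log returns are additive over time, which makes returns over different delays easier to compare than simple percentage changes.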

The approach delivers market insights in two ways, visualization and alerting:

Figure 1: The lower-left plot shows a potentially monotonic relationship between a wallet’s transactions and the market log return after a 5-hour lag.

  1. Visualization: I plot a wallet's transaction history against the market response after a specific time delay. The delay ranges from 1 hour to several days and is selected with a cross-correlation algorithm that identifies the lag with the most significant correlation.
  • For example, for a given whale address, the lower-left plot of Figure 1 reveals a potentially monotonic relationship between the wallet’s transaction volume and the market log return after a 5-hour lag. Inflow transactions may precede market gains, whereas outflow transactions could indicate losses.
  • A color-coding scheme shows when each transaction occurred: yellowish dots mark recent transactions and purplish dots mark earlier ones.
  2. Alerting system: Working with services like Chainbot or Whale Alert, my system aims to notify users of significant fund movements, linking the wallet's historical transaction data with subsequent market changes. For example: "On XX-Jan-2024, an outflow of XXXX ETH from the address preceded a market change of XXX% after a delay of XXX hours." Determining which transaction events are significant enough to report, however, remains a challenge.
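The lag selection in the visualization step could be sketched roughly as follows. This is an assumption about how it might work, not the project's actual algorithm. It scans candidate lags and keeps the one where the Pearson correlation between a wallet's signed hourly flow and the forward log return is largest in absolute value. The names `pearson` and `best_lag` and the synthetic data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def best_lag(net_flow, prices, max_lag):
    """Scan lags 1..max_lag; return (lag, r) with the largest |r|.

    net_flow[t] is the wallet's signed ETH flow in hour t (inflow > 0, outflow < 0);
    prices[t] is the market price at the start of hour t. Both series are
    assumed to be hourly and aligned.
    """
    best = (None, 0.0)
    for lag in range(1, max_lag + 1):
        n = len(prices) - lag
        if n < 3:  # too few points for a meaningful correlation
            break
        fwd = [math.log(prices[t + lag] / prices[t]) for t in range(n)]
        r = pearson(net_flow[:n], fwd)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best

# Synthetic data, built so the 1-hour forward return is proportional to the flow.
flows = [3, -1, 0, 2, -4, 1, 0, -2, 5, -3]
prices = [2000.0]
for f in flows:
    prices.append(prices[-1] * math.exp(0.001 * f))

lag, r = best_lag(flows, prices, max_lag=3)
print(lag, round(r, 3))
```

Pearson correlation only captures linear relationships. Because the relationship in Figure 1 is described as monotonic rather than linear, a rank-based measure such as Spearman's could be substituted in `pearson` without changing the scan. On real data, the lag with the highest correlation in a scan like this can easily be a chance result, so it would need a significance check before being used for alerts.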

There are several limitations to this project:

  1. I've only analyzed ETH mainnet on-chain data since January 1, 2021, focusing on transactions worth 1 million dollars or more. I treat all transactions the same, even though the impact of those involving smart contract interactions can vary significantly. Additionally, significant ETH-related activity on other chains and decentralized exchanges (DEXs) is not captured.
  2. I'm still in the process of gathering labels for whale addresses to determine whether they belong to an organization or exchange.
  3. The historical balance of wallets has not been retrieved yet.

I welcome constructive suggestions and criticism of my approach. If it proves valuable, I will regularly report on development progress and eventually make the analysis tool publicly available.
