Today’s post is a short introduction into using FastDtw for Dynamic Time Warping analysis. Specifically it is a quick introduction to using my newly released FastDtw package.
Dynamic Time Warping (DTW) can be a useful tool when comparing signals or series while adjusting for frequency variance. One downside to the standard algorithm is it can be expensive. There are many ways to mitigate the cost, FastDtw is one of them. While other methods typically hard-cap parameters on the search space, FastDtw provides a dynamic approach to reducing the search space while maintaining a higher level of accuracy. In this post I’ll discuss the usage of my F# package implementation of the FastDtw: Toward Accurate Dynamic Time Warping in Linear Time and Space paper. For those interested in the details, it is a pretty accessible paper. FastDtw is technically an approximation, but its flexible pathing strategy provides good results with some significant performance improvements over basic DTW.
To get started, you will need to have .NET Core installed. Once this is complete the application can be setup.
1 | dotnet new console -lang F# -n BitCoinTrends |
As always, there is some basic setup. I’ll include FastDtw
(obviously), and a charting library for some series comparison visualizations.
1 | open System |
The example will take a csv of Bitcoin/USD conversion values and transform it into a list of datasets broken down by month. The file is a 2010-2019 data download from Yahoo Finance. The format can be seen below, for today’s purposes I only care about the Data and Close fields.
1 | Date,Open,High,Low,Close,Adj Close,Volume |
I want to break the file into datasets by month. Since this is just a small script I’ll make a mini file processing pipeline to group and normalize the data by converting to a percentage of the previous day. This improves comparisons, especially with something as volatile as bitcoin.
1 |
|
Once the datasets are loaded, it is time to see which month most closely mirrors the trend of 9/2019. For all the other code in this post, this is really the highlight: let distance = FastDtw.Distance targetData data radius
. This is where the comparison happens. Radius allows a configurable level of accuracy. It controls the per point search space as the series are compared. In most cases, distance is all that matters, but there are times when how the series match up can be useful. The DistanceWithPath
function provides the series’ indexes that pair together.
1 | let targetMonth = "2019-09" |
Once the comparisons have been completed, it is time to show some comparison charts. I print out the top couple matches as well as a random match.
1 | let showChart targetData compareData title seriesName = |
Once the code is in place, it is time to look at some of the results. It is useful to not only see a good match, but an average match. This helps with the contrast. There is no guarantee there will be a good match, but of the months provided, 2012-10 isn’t a bad match to 2019-09, especially considering what a random match can produce. If you squint a bit, you even can see the portions of similar trends over a month, if only the elapsed time and amplitude is different.
Beyond the example, it is useful to see some of the performance differences with the stock DTW algorithm. For this I setup a quick BenchmarkDotNet test. The below results show the performance benefits. FastDtw is a multi-pass algorithm, so it’s not surprising that on very short series it’s overhead makes it slower. Comparisons for larger series get anywhere from 65% - 70% faster. One possible downside of the current implementation is more allocations, thus the extra GC events. This is one of those cases where the allocations can be reduced with a bit of refactoring, so this looks like a good place for future optimizations.
1 | | Method | SeriesSize | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated | |
This has been a quick introduction into the FastDtw package. I hope you’ve enjoyed this. Until next time.