Week 4: 30th Sept – 4th Oct

This week, I focused almost entirely on one major task: reverse-engineering the formulas Samsung uses to calculate certain metrics from its raw data. This was critical because, having decided to use Google Health Connect, we no longer had direct access to Samsung’s pre-calculated metrics, which included:

  • Physical Recovery Score
  • Mental Recovery Score
  • Restfulness
  • Overall Sleep Score

These metrics were significant to our project, as they offered insights that raw data alone couldn’t provide.

Reverse-Engineering Samsung’s Formulas

Having access to both the raw data and Samsung’s calculated metrics, I began experimenting to uncover the underlying formulas. Through trial and error, I identified that the calculations relied on multiple factors, each weighted differently to produce the final scores. My approach unfolded as follows:

  1. Identifying Factors: I reviewed Samsung’s minimal descriptions of the calculated metrics to infer the possible factors involved. For example, restfulness might depend on variables like movement during sleep, total sleep time, and time spent in deep sleep.
  2. Testing Mathematical Relationships: Using a few raw data sets and their corresponding calculated scores, I attempted to determine the weights assigned to each factor. Fitting a single data set was straightforward, because many different weight combinations can reproduce one score. To be valid, however, the weights had to reproduce the scores consistently across all data sets.
  3. Challenges with Consistency: When applying the weights derived from one set of data to others, the results varied drastically, indicating that my initial method wasn’t robust enough. This led to significant frustration as I realized the need for a more comprehensive approach.
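
The consistency problem in steps 2 and 3 can be illustrated with a small sketch. It assumes each score is a plain weighted sum of factor values, which is only a working hypothesis and not Samsung’s confirmed formula. The nights, factor values, and both weight sets below are invented for illustration.

```python
from typing import Sequence, Tuple

# (factor values, score reported for that night); all numbers are made up
Night = Tuple[Sequence[float], float]


def weighted_score(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed model: the score is a weighted sum of the factor values."""
    return sum(f * w for f, w in zip(factors, weights))


# Hypothetical factors: (movement, total sleep time, deep sleep), each scaled 0-100
night_a: Night = ((80.0, 60.0, 40.0), 62.0)
night_b: Night = ((50.0, 90.0, 70.0), 74.0)

# Two different weight sets that both reproduce night A
weights_1 = (0.4, 0.3, 0.3)
weights_2 = (0.3, 0.5, 0.2)

for name, weights in (("weights_1", weights_1), ("weights_2", weights_2)):
    for label, (factors, reported) in (("A", night_a), ("B", night_b)):
        error = abs(weighted_score(factors, weights) - reported)
        print(f"{name} on night {label}: error = {error:.2f}")
```

Both weight sets match night A, but weights_1 misses night B by 6 points while weights_2 still fits. One night on its own cannot tell the two apart, which is why weights derived from a single data set fell apart on the others.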

Developing a Web UI

To simplify the process of testing and visualizing results, I developed a web-based UI that:

  • Allowed input of raw data sets.
  • Displayed the calculated results based on the derived weights.
  • Provided tools for analysing and comparing the results across multiple data sets.

This tool streamlined my workflow and made it quick to iterate on and test new weights, saving time in the long run.

Brute-Forcing the Solution

After struggling to derive the weights mathematically, I resorted to a brute-force approach. Instead of deriving the weights by hand, I wrote a program to evaluate every combination of candidate weights against 12 data sets:

  • The program performed approximately 96 million calculations in a matter of seconds, identifying the optimal set of weights that worked consistently across all data sets.
  • The derived weights were then validated against new data sets, producing accurate and reliable results.
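
A minimal sketch of this kind of exhaustive search is shown below. It is not the original program: the weighted-sum model, the grid of candidate weights, the function names, and the sample data are all assumptions made for illustration. As a rough sense of scale, a grid of 200 candidate values for each of three weights gives 8 million combinations, or 96 million score evaluations over 12 data sets, although whether the original search was structured that way is a guess.

```python
from itertools import product
from typing import List, Sequence, Tuple

# (factor values, reported score); all numbers below are made up
DataSet = Tuple[Sequence[float], float]


def total_error(weights: Sequence[float], data_sets: List[DataSet]) -> float:
    """Sum of absolute differences between predicted and reported scores."""
    return sum(
        abs(sum(f * w for f, w in zip(factors, weights)) - reported)
        for factors, reported in data_sets
    )


def brute_force_weights(
    data_sets: List[DataSet], n_factors: int, steps: int = 10
) -> Tuple[Tuple[float, ...], float]:
    """Try every weight combination on a grid of 0, 1/steps, ..., 1."""
    grid = [i / steps for i in range(steps + 1)]
    best_weights: Tuple[float, ...] = ()
    best_error = float("inf")
    for weights in product(grid, repeat=n_factors):
        error = total_error(weights, data_sets)
        if error < best_error:
            best_weights, best_error = weights, error
    return best_weights, best_error


# Hypothetical data sets used to search for the weights
training_sets: List[DataSet] = [
    ((80.0, 60.0, 40.0), 62.0),
    ((50.0, 90.0, 70.0), 74.0),
    ((90.0, 40.0, 60.0), 59.0),
]
# Hypothetical data set held back to validate the result
held_out: DataSet = ((30.0, 70.0, 100.0), 64.0)

best_weights, best_error = brute_force_weights(training_sets, n_factors=3)
validation_error = total_error(best_weights, [held_out])
print("best weights:", best_weights)
print("training error:", best_error, "validation error:", validation_error)
```

Scoring every candidate against all data sets at once is what enforces consistency: a weight set that fits one night but not the others accumulates a large total error and is discarded. Checking the winner against data that was not part of the search mirrors the validation step described above.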

Final Integration

With the successful weights in hand, I integrated them into the web UI, enabling the team to input raw data and instantly view the calculated metrics. This allowed us to move forward confidently with our use of Google Health Connect, bridging the gap left by the absence of Samsung’s pre-calculated data.

Reflections

This week highlighted the importance of persistence and adaptability when solving complex problems. While I initially resisted the brute-force approach, it ultimately provided the solution we needed. Developing the web UI was another valuable step, offering both functionality and a platform to test future data-related features.

This task also gave me a deeper appreciation for balancing mathematical rigour with practical, computational solutions, a skill that should prove useful in the later stages of the project.