Coastal Mountain Walks: Exploring Google Fit Data

In early January 2026, I spent two weeks in and around Nerja, a small coastal town on the eastern edge of Málaga province in southern Spain. It's a place where the Sierra Almijara mountains meet the Mediterranean, and that geography made for some genuinely rewarding walking. Over the course of the trip I explored the town itself, made three trips up to the whitewashed village of Frigiliana perched in the hills above, and did a number of hikes further into the mountains, some with quite challenging terrain. Komoot's map markers were occasionally more optimistic than the trails themselves.

Throughout the trip I tracked my activity with Google Fit on my phone, capturing steps, distance, move minutes, calories, heart points, and speed data for most of my outdoor movement. A few caveats worth keeping in mind: I didn't have my phone on me at all times, calisthenics sessions and daily stretching went untracked, and travel days to and from Nerja are excluded. Without continuous heart rate or GPS data, metrics like calories and heart points are estimates at best, so I'm treating step count as the most meaningful number to focus on.

This notebook explores that activity data day by day. The goal isn't rigorous sports science; it's a personal record of how much and when I was moving through a landscape I really enjoyed. If you're curious about walking in this part of Andalusia, I hope it gives you a flavor of what's on offer.

Setup

Let's start by importing the three main libraries used throughout this notebook: Polars for data manipulation, NumPy for numerical operations, and Altair for visualization. A few Polars display settings are configured upfront to make sure larger tables render cleanly.

I also define two small helper functions used across all charts. chart_text generates a consistent Altair title block with the notebook title, attribution, and an optional subtitle. dt_convert parses datetime strings from the raw CSV files and converts them to Madrid local time, stripping the timezone info afterward for easier handling.

import altair as alt
import numpy as np
import polars as pl

pl.Config.set_tbl_rows(50)
pl.Config.set_tbl_cols(50)
pl.Config.set_fmt_str_lengths(100)


attribution = [
    'Google Fit data tracked with a phone during a vacation in and around Nerja, Spain (January 3-15, 2026)',
    'Author: Ramiro Gómez • ramiro.org'
]

columns = ['Start time', 'End time', 'Move Minutes count', 'Calories (kcal)', 'Distance (m)', 'Heart Points', 'Step count', 'Min speed (m/s)', 'Max speed (m/s)', 'Average speed (m/s)']

main_metrics = ['Steps', 'Move Minutes', 'Distance (m)', 'Average speed (m/s)', 'Heart Points', 'Calories (kcal)']


def chart_text(title, subtitle=None, offset=10):
    subtitle = [subtitle] + attribution if subtitle else attribution
    return alt.Title(
       f'{title} • Coastal Mountain Walks',
       subtitle=subtitle,
       anchor='start',
       frame='group',
       orient='bottom',
       offset=offset)


def dt_convert(col):
    return (
        col.str.strptime(pl.Datetime, format='%Y-%m-%d %H:%M:%S.%3f%z')
        .dt.convert_time_zone('Europe/Madrid')
        .dt.replace_time_zone(None)
    )

Next, let's load the data. The Google Fit export from Google Takeout contains a separate CSV file for each day, so the code loops over the date range of the trip (January 3 to 15), reads each file, and selects the columns we care about. The start and end times are combined with the date and converted to local time using the dt_convert helper defined above. The step and move minute columns get shorter aliases, and the original time and count columns are dropped once their replacements exist. All daily subsets are then concatenated into a single DataFrame, with Day and Hour columns extracted from the start time for easier grouping later. A summary of descriptive statistics gives a feel for the data's shape and range.

day_range = range(3, 16)
month = 1
year = 2026
all_data = []

for day in day_range:
    s_date = f'{year}-{month:02d}-{day:02d}'
    data = pl.read_csv(f'~/data/health/google-fit/Daily activity metrics/{s_date}.csv')
    subset = data.select(pl.col(columns)).with_columns(
        pl.col('Move Minutes count').alias('Move Minutes'),
        pl.col('Step count').alias('Steps'),
        dt_convert(pl.lit(s_date) + ' ' + pl.col('Start time')).alias('Start'),
        dt_convert(pl.lit(s_date) + ' ' + pl.col('End time')).alias('End'),
    ).drop(['Start time', 'End time', 'Step count', 'Move Minutes count'])
    all_data.append(subset)

df = pl.concat(all_data).with_columns(
    pl.col('Start').dt.day().alias('Day'),
    pl.col('Start').dt.hour().alias('Hour')
)

df.describe()
shape: (9, 13)
| statistic | Calories (kcal) | Distance (m) | Heart Points | Min speed (m/s) | Max speed (m/s) | Average speed (m/s) | Move Minutes | Steps | Start | End | Day | Hour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| dtype | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | str | str | f64 | f64 |
| count | 1248 | 468 | 222 | 438 | 438 | 438 | 395 | 523 | 1248 | 1248 | 1248 | 1248 |
| null_count | 0 | 780 | 1026 | 810 | 810 | 810 | 853 | 725 | 0 | 0 | 0 | 0 |
| mean | 24.703503 | 442.422777 | 5.518018 | 0.39021 | 1.199472 | 0.787865 | 9.159494 | 562.409178 | 2026-01-09 11:52:30 | 2026-01-09 11:52:30 | 9.0 | 11.5 |
| std | 16.559387 | 392.570295 | 4.511137 | 0.171805 | 0.610341 | 0.361626 | 5.063794 | 531.12665 | null | null | 3.743157 | 6.924962 |
| min | 16.427083 | 0.63186 | 1.0 | 0.245569 | 0.252014 | 0.252014 | 1.0 | 0.0 | 2026-01-03 00:00:00 | 2026-01-03 00:00:00 | 3.0 | 0.0 |
| 25% | 16.427083 | 63.362297 | 2.0 | 0.317757 | 0.646106 | 0.473763 | 4.0 | 64.0 | 2026-01-06 06:00:00 | 2026-01-06 06:00:00 | 6.0 | 6.0 |
| 50% | 16.427083 | 359.350919 | 4.0 | 0.328714 | 1.271728 | 0.767617 | 10.0 | 388.0 | 2026-01-09 12:00:00 | 2026-01-09 12:00:00 | 9.0 | 12.0 |
| 75% | 16.427083 | 741.863914 | 8.0 | 0.395376 | 1.532394 | 1.045923 | 14.0 | 1019.0 | 2026-01-12 17:45:00 | 2026-01-12 17:45:00 | 12.0 | 17.0 |
| max | 81.365998 | 1569.805354 | 23.0 | 1.643359 | 3.243516 | 1.889552 | 15.0 | 1936.0 | 2026-01-15 23:45:00 | 2026-01-15 23:45:00 | 15.0 | 23.0 |

The raw data is recorded in 15-minute intervals, so let's aggregate it into two views we'll use throughout the notebook: one grouped by day and one by day and hour. Summing makes sense for cumulative metrics like steps, distance, and calories, while speed is averaged (mean) and the min/max values are preserved across intervals.

aggs = [
    pl.col('Average speed (m/s)').mean(),
    pl.col('Max speed (m/s)').max(),
    pl.col('Min speed (m/s)').min(),
    pl.col('Calories (kcal)').sum(),
    pl.col('Distance (m)').sum(),
    pl.col('Heart Points').sum(),
    pl.col('Move Minutes').sum(),
    pl.col('Start').first().alias('Start'),
    pl.col('Steps').sum(),
]

by_day = df.group_by('Day', maintain_order=True).agg(aggs)
by_day_hour = df.group_by('Day', 'Hour', maintain_order=True).agg(aggs)

Daily Activity Metrics

A good place to start is a broad overview of all six metrics across the 13 days. The data is first reshaped into long format to make it easy to plot each metric as its own panel. The facets are ordered by reliability and importance (steps first, calories last), with each panel using an independent y-axis scale since the metrics have very different units and magnitudes.

# Reshape to long format
long_df = by_day.unpivot(
    index=['Day'],
    on=main_metrics,
    variable_name='Metric',
    value_name='Value'
)
# Create a facet chart
alt.Chart(long_df).mark_bar().encode(
    x=alt.X('Day:O', title=None),
    y=alt.Y('Value:Q', title=None),
    color=alt.Color('Metric:N', legend=None),
).properties(
    height=200,
    width=280
).facet(
    facet=alt.Facet('Metric:N', sort=main_metrics, title=None),
    columns=3,
    title=chart_text('Daily Activity Metrics')
).resolve_scale(
    y='independent'
)

A few days stand out immediately. Jan 13 ranks highest across almost every metric. It was the longest and most demanding day of the trip, a there-and-back hike to Frigiliana via the Río Higuerón Canyon. Jan 8 also scores high on steps and distance despite being a mountain hike with some genuinely tricky navigation. The two quieter days, Jan 7 and Jan 9, were more or less recovery days between the more demanding outings. Jan 5 scores surprisingly well across all metrics despite not involving particularly challenging terrain.

Steps by Day and Hour

Next, let's look at when during the day the steps were accumulated. This punchcard chart plots each day against each hour, with circle size encoding the step count and color showing move minutes, giving a sense of both volume and intensity of movement at a glance. Hours with no recorded steps are omitted.

col = 'Steps'
by_day_hour.filter(pl.col(col) > 0).plot.circle(
    x='Hour:O',
    y='Day:O',
    color=alt.Color('Move Minutes:N').bin(step=15).legend(title='Move Minutes').scale(scheme='purplebluegreen'),
    size=alt.Size(col, scale=alt.Scale(domain=[1, by_day_hour[col].max()]))
).properties(
    height=500,
    width=700,
    title=chart_text(f'Total {col} by Day and Hour')
)

Most days follow a broadly similar pattern, with activity picking up in the late morning and peaking between noon and 3pm. Jan 10 has the single highest step count in any one hour: nearly 6,000 steps at noon, part of a sustained push from midday through to 6pm. Jan 13, the busiest day overall, shows the widest spread of activity, with meaningful step counts running from 8am all the way through to 9pm, a reflection of the long there-and-back route via the Río Higuerón Canyon.

Jan 6, the Frigiliana hike with river crossings, shows concentrated activity from 11am through 6pm, with the bus back to Nerja around 5pm. Jan 7 stands out as the quietest day, with low, evenly distributed steps and no dominant peak, consistent with a relaxed town day. Jan 15 has a notable morning gap because I left my phone at home during a couple of short swims at the beach, so the activity only picks up from 1pm onward.

Daily Activity Profiles

To compare days more fairly across all six metrics at once, each metric is normalized to a percentile rank from 0 to 100. This shows where each day sits relative to the others across the full range of measurements. Larger, more evenly shaped profiles indicate days that were consistently active across multiple dimensions.

# Normalize metrics using a percentile ranking (0-100) for fair comparison
normalized = by_day.with_columns([
    (pl.col(m).rank() / pl.col(m).count() * 100).alias(m)
    for m in main_metrics
])

# Create angle mapping
len_m = len(main_metrics)
angles_df = pl.DataFrame({
    'Metric': main_metrics,
    'Angle': [i * 360 / len_m for i in range(len_m)]
})

# Reshape to long format and join with angle mapping
long_df = normalized.unpivot(
    index=['Day'],
    on=main_metrics,
    variable_name='Metric',
    value_name='Value'
).join(angles_df, on='Metric')

# Convert to x, y coordinates
long_df = long_df.with_columns(
    (pl.col('Value') * np.cos(pl.col('Angle') * np.pi / 180)).alias('x'),
    (pl.col('Value') * np.sin(pl.col('Angle') * np.pi / 180)).alias('y')
)

# Create radar chart
alt.Chart(long_df).mark_line(point=True, filled=True, opacity=0.3).encode(
    x=alt.X('x:Q', axis=None),
    y=alt.Y('y:Q', axis=None),
    color=alt.value('steelblue'),
    order='Angle:Q',
    tooltip=['Day:O', 'Metric:N', alt.Tooltip('Value:Q', format='.1f', title='Normalized (0-100)')]
).properties(
    width=170,
    height=170
).facet(
    facet=alt.Facet('Day:O', title='January', header=alt.Header(labelExpr="'Day ' + datum.value")),
    columns=5,
    title=chart_text(
        'Daily Activity Profiles',
        subtitle='Each metric shown as percentile rank (0-100) across all days. Larger shapes indicate more active days.')
)

Jan 13 and Jan 5 both show large profiles with nearly equal area, though their shapes differ. Jan 5 ranks at the top for average speed and heart points, while Jan 13 spreads its strength more evenly across distance and move minutes. Jan 3 forms an almost perfect hexagon, ranking consistently high across all six metrics.

Jan 8, despite high step counts and distance, shows an irregular shape due to lower average speed, reflecting the terrain and navigation challenges of that mountain route. Jan 7 is predictably small and compressed, ranking at the bottom across the board, while Jan 11 displays a lopsided profile with strong speed and many heart points but weaker performance elsewhere. Jan 4 and Jan 9 are also quite small, consistent with lighter activity, and Jan 15 sits somewhere in the middle.

Correlation Matrix: Activity Metrics

This scatterplot matrix shows how different metrics relate to each other across all 15-minute intervals throughout the trip. Each point represents a single interval, colored by time of day. Strong linear patterns indicate metrics that move together, while scattered plots suggest more independent variation.

plot_data = df.with_columns(
    pl.when(pl.col('Hour') < 12).then(pl.lit('Morning'))
      .when(pl.col('Hour') < 18).then(pl.lit('Afternoon'))
      .otherwise(pl.lit('Evening')).alias('Time of Day')
)

plot_data.plot.circle(
    x=alt.X(alt.repeat('column'), type='quantitative'),
    y=alt.Y(alt.repeat('row'), type='quantitative'),
    color=alt.Color('Time of Day:N',
        legend=alt.Legend(title='Time of Day'),
        sort=['Morning', 'Afternoon', 'Evening']),
    opacity=alt.value(.7)
).properties(
    width=240,
    height=240
).repeat(
    column=['Distance (m)', 'Move Minutes', 'Average speed (m/s)'],
    row=['Steps', 'Calories (kcal)', 'Heart Points']
).properties(title=chart_text('Correlation Matrix: Activity Metrics'))

Steps, distance, and move minutes show strong positive correlations, as expected — more time moving generally means more steps and greater distance covered. Heart points show a triangular or arrowhead pattern with move minutes: you can accumulate maximum move minutes (15 per interval) with anywhere from zero to high heart points, but you can't earn heart points without move minutes. This reflects Google Fit's intensity calculation, where sustained moderate-pace walking earns move minutes but not necessarily heart points, while faster or more strenuous activity earns both. Average speed shows the weakest correlations overall, which makes sense given that speed can vary widely regardless of duration or total distance. Time of day doesn't show strong clustering patterns, suggesting activity intensity was fairly consistent whether walking in the morning, afternoon, or evening.

Daily Steps with Trend

To better visualize the variation in daily activity, this chart shows individual step counts as bars alongside a 7-day moving average that smooths out the day-to-day fluctuations and reveals the overall trend across the trip.

# Calculate 7-day moving average
with_trend = by_day.with_columns(
    pl.col('Steps').rolling_mean(window_size=7, min_samples=1).alias('7-Day Average')
)

# Create base chart
base = alt.Chart(with_trend).encode(
    x=alt.X('Day:O', title='Day of January')
)

# Bars for daily steps
bars = base.mark_bar(color='steelblue', opacity=0.7).encode(
    y=alt.Y('Steps:Q', title='Steps'),
    tooltip=['Day:O', 'Steps:Q']
)

# Line for moving average
line = base.mark_line(color='orange', strokeWidth=3).encode(
    y=alt.Y('7-Day Average:Q'),
    tooltip=['Day:O', alt.Tooltip('7-Day Average:Q', format='.0f', title='7-Day Average')]
)

(bars + line).properties(
    width=800,
    height=500,
    title=chart_text('Daily Steps with 7-Day Moving Average')
)

The moving average reveals a remarkably consistent activity level across the trip, hovering around 22,000–23,000 steps per day with only slight dips on the quieter days (Jan 7 and 9). Despite the wide variation in individual daily totals, ranging from under 14,000 to over 32,000 steps, the overall trend line remains steady, suggesting a well-balanced mix of intense hiking days and lighter recovery periods.

Conclusion

I very much enjoyed this vacation. The hiking was physically and sometimes mentally demanding, particularly on days involving river crossings, steep terrain, or unreliable trail markers, but the effort was well worth it. While I felt quite exhausted after some of the longer routes, I typically felt ready to go again the next day, which speaks to the restorative quality of being physically active.

For a more visual sense of the terrain, I've created a 3D Elevation Diary that matches photos I took to the actual topography of the area. If you're planning your own hikes around Nerja and Frigiliana, you can find my routes on Komoot.

If you track your own activity data, I'd encourage you to dig into it. Even simple metrics like daily step counts can reveal patterns you might not notice day to day, and it's a satisfying way to document a trip or training period. The code in this notebook should work with any Google Fit data exported through Google Takeout; just swap in your own CSV files and adjust the date range. Have fun analyzing your data, and thanks for reading!

Published on February 13, 2026 by Ramiro Gómez. To be informed of new posts, subscribe to the RSS feed.
Tags: fitness analytics, health data, data story, jupyter notebook, spain.