One of the forgotten questions in training is not what exercise to choose, nor how much load to prescribe, but when to change the challenge.
Far too often this decision is left to a pre-set schedule of a few weeks (or days, or months), or hard-coded into some systematic, linear training plan. Far less often do we ask whether the athlete’s response to a stimulus has actually changed. If adaptation does not unfold smoothly, but instead reorganizes in phase transitions that are specific to each individual, then the timing of that change becomes a more important consideration than we might like to admit.
But what should we look for when making such response-based decisions? New all-time highs in performance? A sudden loss of performance? Or something more elusive, like a shift in the typical level around which performance fluctuates?
What we observe from session to session are only momentary expressions of how we react to stimulus. Some days the numbers will be higher, some days lower, but most of the time they move within a certain range. When that range itself begins to shift, something more fundamental may have changed.
If human adaptation tends to fluctuate around one level before shifting to another, then single performances tell us very little. Doing something once is an anomaly, not a baseline. Expecting an anomaly to repeat on command is to overestimate the state of the system in front of you.
We would be better off not trying to detect a single personal best, a single peak, or even a trend line that assumes smooth improvement. What we should be looking for is when the typical state of the system changes, when the “normal highs and lows” move to a new level. In short, we need to distinguish between normal fluctuation and genuine change.
This does not require vast datasets or advanced statistical software, but it does require that we look at performance across time, rather than in isolation, and that we think carefully about what we are really measuring, and what it can teach us.

Systems change over time, and the simplest way to think about them is in terms of their state, the set of quantities that determines how the system will respond to change.
Imagine a tank filled with water, with a valve at the bottom that you can open to let water flow out. The opening of the valve is a variable: if you turn it to a given position, it will show the same degree of opening as the last time you turned it to that position.
But the level of water in the tank is a state: the effect of opening the valve depends on how full the tank already is. If the level is high, the pressure is greater and the flow behaves differently than if the level is low. So in a sense a state is a variable with a memory: its behavior is determined not only by its inputs, but also by what it was before that input was applied.
In nature, changes of state are often abrupt rather than gradual. Stable patterns wobble, lose coherence, and hit what scientists call phase transitions, bifurcation points, or simply tipping points. In those moments the system reorganizes into a new pattern of functioning, which may contain a different set of options.
Boiling water is the familiar case. You can heat it steadily and nothing seems to happen. Then, at 100°C, everything changes at once and new possibilities open up: you can now sterilize, steam, cook foods, extract compounds that stay locked away below boiling… The phase change didn’t just give you hotter water, it gave you new possibilities of moving forward.
Humans are part of nature, and behave the same way: long stretches of nothing, followed by sudden reorganizations. Each transition could be a point where the athlete would benefit from being challenged differently, because new possibilities have opened up.

Think of a river flowing down a valley. The river’s level and flow pattern represent its baseline state. The tops of the sandbanks mark the highs and lows. Over short periods, waves rise and fall with rain or drought, and these changes are normal ups and downs.
But a small shift in the channel, something like a fallen tree, or a new sediment bar, or some abnormal change in the weather system that affects a rock formation somewhere, can redirect the entire flow. The river looks much the same, of course, and the surface waves still rise and fall, but behind the scenes, the main current is following a new route.
On the other hand, in aging systems, rigidity and loss of variability are common. The “channel” becomes entrenched and narrow, which is good for some tasks, but bad for adaptability: a single entrenched path means vulnerability when conditions change.
So detecting both baseline shifts and loss of variation seems important, but neither is fully visible when looking at single sessions only. A single good or bad day can mislead. It’s only across several sessions that you start to see whether the system is actually at a different level, or simply fluctuating around the same one.
There are statistical ways of doing this. In fields like finance, engineering, and medicine, change-point detection models are used to identify when the typical pattern of a time series has shifted enough that yesterday’s baseline no longer applies. When that happens, it’s a strong signal to change treatment, recalibrate the machine, revise the model, or shift the strategy altogether.

But most of these methods are far too complicated for the strength coach to use. They rely on advanced statistics that even trained mathematicians spend time getting right, and they’re designed for systems where you can collect thousands of observations under controlled conditions. Our data are sparse, messy, and full of context you can’t easily quantify.
So what to do? I have a few suggestions that are good enough for our purposes and cover the situations we coaches usually face.
Obviously, before we do calculations, we need data to do them on. The data, for our purposes, need to be ordered in time, but don’t need to be complicated. Actually it’s better if they are not, as complicated things tend not to get done regularly. Also, for gym work, testing one-rep maxes tends to be physically and mentally demanding, so instead of doing that I would advise measuring something that corresponds well with that max performance. Speed of movement at a given load tends to work well, as does testing of isometric maximum force.
For speed measurements:
Pick a consistent load (a weight that you deem to be roughly 70–80% of 1RM) and track velocity over time, or track the load that produces 0.5 m/s. I would most often go for the first option, as finding the load that can be moved at a certain speed requires more lifts.
For force measurements:
Isometric testing allows you to measure force at very specific joint positions. That means you can choose positions that are highly relevant to the sport or movement you care about. As a general rule, however, testing at longer muscle lengths tends to correlate better with dynamic strength. To gauge progress in the dynamic movements of the training program (such as squats and deadlifts), you should therefore often opt to test in deeper positions rather than shallower ones, regardless of how your sport looks.

In practice, the two types of tests complement each other:
Isometric tests give you a stable, low-fatigue anchor of maximal force at a position. Velocity tests give you a moving picture of how force is expressed in a movement you are already likely to use in the regular training.
If you work with a closed sport – running, cycling, weightlifting, CrossFit – it’s often possible to collect meaningful data directly from practice. Especially cyclists, with the widespread use of power-meters, often already collect maximal efforts at different thresholds as part of their normal training.
Mathematically, this very sport-specific data can be treated similarly once ordered in time, using much the same models to identify shifts in trends. This methodological similarity can tempt you into using it in the same way, when there are substantial differences in what you can learn from each kind of data and in how they should be analyzed and interpreted.
Research consistently shows that relationships between physical training and sports performance are context-dependent. Increases in strength and power in the gym often, but not always, translate into performance on the field. Conversely, sports performance can improve without similar improvements in the metrics tracked in the gym.
Data from the field show how all the things that make up performance (physical and chemical capacity, technical ability and psychological factors) intertwine into one thing, whereas data from the gym show how one factor might develop in order to support the whole.
But although tracking these related-but-not-identical signals means tracking two things, the relation between them carries a lot of information for training program design. If both trend upward over time, our confidence in what we do in the respective areas of training obviously improves. But if they diverge, that can be informative too, and might enable us to sharpen our training decisions.

So far we have talked about principles: that a personal best is not on its own a description of an athlete’s state, that the data we collect should be simple enough to be collected over time, that we need to consider carefully what it really can tell us, and that statistical tools can help us judge when the same stimuli might not shake the athlete in the same way as before.
Some days are faster, some slower, and a simple rolling average reduces this volatility by grouping the data in windows of, let’s say, the last five sessions. This is very easy to compute, but it also means the trend line can jump abruptly when a particularly strong or weak session drops out of the window, even if current performance has not meaningfully shifted.

We can see this in an example with bar-speed data from the back squat loaded to 100 kg, collected during regular strength training.
The short-term noise in the graph is certainly dampened, but the curve still shows relatively sharp turns. In several places, the average shifts more abruptly than one would suspect the underlying performance state would suggest.
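As a minimal sketch of the rolling average described above: the function name `rolling_mean`, the window of five sessions, and the bar speeds are all assumptions made for illustration, not the data behind the article's graph.

```python
def rolling_mean(values, window=5):
    """Mean of the most recent `window` sessions; None until enough data exists."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough sessions yet
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


# Hypothetical bar speeds (m/s) at a fixed load, one value per session.
speeds = [0.50, 0.52, 0.48, 0.51, 0.60, 0.50, 0.49, 0.51, 0.50, 0.50]
smoothed = rolling_mean(speeds, window=5)
```

In this made-up series the last two sessions are identical (0.50 m/s), yet the rolling average falls from about 0.52 to 0.50 between them, simply because the unusually fast 0.60 session leaves the window.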
An exponential moving average behaves slightly differently. Instead of discarding older data, it gradually reduces their influence. With the same data as above, the EMA produces a smoother representation of performance, while avoiding sharp shifts in the average.
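A rough sketch of an exponential moving average follows. The function name `ema`, the smoothing factor `alpha=0.3`, the choice to seed the average with the first session, and the data are assumptions for illustration only.

```python
def ema(values, alpha=0.3):
    """Exponential moving average.

    alpha is the weight given to the newest session (0 < alpha <= 1);
    older sessions keep a gradually shrinking influence instead of
    dropping out of a window all at once.
    """
    out = []
    for v in values:
        if not out:
            out.append(v)  # assumption: seed with the first observation
        else:
            out.append(alpha * v + (1 - alpha) * out[-1])
    return out


# Hypothetical bar speeds (m/s) at a fixed load, one value per session.
speeds = [0.50, 0.52, 0.48, 0.51, 0.60, 0.50, 0.49, 0.51, 0.50, 0.50]
smoothed = ema(speeds, alpha=0.3)
```

A higher `alpha` makes the line follow recent sessions more closely, while a lower one makes it smoother but slower to react.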

If biological adaptation rarely shows itself as perfectly smooth, linear increments, but rather appears to fluctuate around one mean for a period before shifting to fluctuate around another, then a method that filters out daily variability while staying responsive to change becomes important.
Conceptually the difference between the two methods is that the rolling average helps us see the recent past more clearly, while the exponential moving average helps us see the present, while still retaining some influence from the past.
For a coach this has practical consequences as training decisions are often based on recent adaptation, not on performance from several weeks ago. A method, like EMA, that balances stability with responsiveness might therefore be more useful to us.
It is worth noting that even this seemingly simple method requires choices, such as what smoothing factor (the weight attributed to more recent results) to use in an EMA, or what size of window in a simple rolling average, and that these choices influence what is shown, and in turn how we might interpret the results.
Rather than just smoothing data, like a moving average does, methods of formal change detection are used to mathematically identify a shift in the underlying mean in a time series of data. For a coach that would mean a method to identify whether what we are seeing is just normal fluctuation, or real change.
Quite advanced statistical methods for change detection exist that can be powerful when large, high-quality datasets are available, but CUSUM modeling is a simpler way of doing this. Instead of smoothing performance, CUSUM accumulates small deviations from a reference level. If performance is consistently a little higher or lower than usual, the cumulative sum rises or falls. As random fluctuations tend to cancel out, this method picks up sustained changes.
In the coaching environment, where data are sparse and noisy, CUSUM offers a practical middle ground and gives you a relatively advanced method for change detection that is still simple enough to implement without the use of advanced statistical software, as it can be implemented in a more coach-friendly program, like Microsoft Excel.

CUSUM works by accumulating small deviations from a reference level. Instead of looking at each session in isolation, it keeps a running total of how much performance has been above or below what is considered normal. To do this, we first define a reference value, a baseline, which is an average of the first N sessions in the current phase. Then for each session we calculate how far the result is from that reference. In practice two CUSUM lines are usually used, one that tracks upward shifts and one that tracks downward shifts.
The method requires one parameter, h, the decision threshold: how large the cumulative sum must become before we treat it as a signal of a phase shift. A second parameter, k, is not strictly necessary, but it is smart for noisy data such as ours, as it offers a way to adjust the sensitivity of the method: deviations smaller than k do not add to the cumulative sum.
CUSUM_pos = MAX(0, previous + (value − reference − k))
CUSUM_neg = MIN(0, previous + (value − reference + k))
If performance fluctuates around the reference level, the cumulative values tend to revert toward zero. If performance consistently deviates in one direction, the cumulative value grows until it crosses the threshold, at which point a phase shift is signaled. A new baseline can then be calculated, and the process is repeated.
When the positive cumulative sum would drop below zero, it resets to zero. Similarly, the negative cumulative sum resets to zero when it would rise above zero, that is, when the deviations no longer persist. This prevents isolated sessions from generating lasting signals and ensures that only sustained changes are highlighted.
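The two formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cusum`, the baseline length `n_baseline=5`, the values of `k` and `h`, the choice to start accumulating after the baseline sessions, and the data are all hypothetical, and the article's Excel file may be set up differently.

```python
def cusum(values, n_baseline=5, k=0.01, h=0.05):
    """Two-sided CUSUM against the mean of the first n_baseline sessions.

    Returns (session_index, direction) for the first sustained shift,
    or None if neither cumulative sum crosses the threshold h.
    """
    reference = sum(values[:n_baseline]) / n_baseline
    pos, neg = 0.0, 0.0
    # Assumption: accumulation starts after the baseline sessions.
    for i in range(n_baseline, len(values)):
        deviation = values[i] - reference
        pos = max(0.0, pos + (deviation - k))  # tracks upward shifts
        neg = min(0.0, neg + (deviation + k))  # tracks downward shifts
        if pos > h:
            return (i, "up")
        if neg < -h:
            return (i, "down")
    return None


# Hypothetical bar speeds (m/s): five baseline sessions, then a small lift.
speeds = [0.50, 0.50, 0.50, 0.50, 0.50, 0.53, 0.53, 0.53, 0.53]
signal = cusum(speeds)
```

With these made-up numbers no single session looks remarkable, but three consecutive sessions slightly above baseline are enough to push the positive sum past `h`. After a signal, a new reference would be computed from the sessions that follow and the process repeated.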

Once we have identified where sustained shifts in performance occur, we can represent the data as a series of phases rather than as a continuous trend. Piecewise modeling then allows us to describe what those phases (and their respective baselines) look like, by fitting separate segments to periods where performance behaves differently.
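In its simplest form, a piecewise description can be a flat line at the mean of each phase. The sketch below assumes that simplification; the function name `piecewise_means` and the example values are hypothetical, and other segment shapes (for instance a slope per phase) are equally possible.

```python
def piecewise_means(values, change_points):
    """Describe each phase by its mean.

    change_points are the session indices where a new phase starts,
    for example those flagged by a change-detection step. They are
    assumed to be distinct and strictly between 0 and len(values).
    Returns a list of (start, end, mean) with end exclusive.
    """
    bounds = [0] + sorted(change_points) + [len(values)]
    segments = []
    for start, end in zip(bounds, bounds[1:]):
        chunk = values[start:end]
        segments.append((start, end, sum(chunk) / len(chunk)))
    return segments


# Hypothetical series with one shift starting at session index 4.
series = [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0]
phases = piecewise_means(series, change_points=[4])
```

Plotting each `(start, end, mean)` as a horizontal step on top of the raw sessions gives the kind of phase picture that can be shown to the athlete.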
When I have taken the time to do this and shown it to the athlete, the response has usually been immediate recognition and acknowledgment that their experience of the training aligns with what they now are able to see.
While detecting (genuine) change is mostly what we coaches need for our decision process, this visual approach makes the training and adaptation very clear to the athlete. This shared understanding can strengthen buy-in and help with long-term motivation for both the athlete and the coach.

When real change occurs, a new behavior has evolved, not in a psychological sense, but in the sense that the response to the same stimuli now fluctuates around a different mean. It is therefore possible that what previously challenged the athlete will no longer push in quite the same way.
This, then, may be an appropriate time to reconsider how we challenge the athlete, not because the previous training was ineffective, but because it may already have done what it could. Detecting these shifts, and interpreting them carefully, is therefore a central but often undervalued part of training design.
Worth noting is that detecting too little variation between sessions can be important as well. A system that is exposed only to a narrow range of demands may appear stable, but stability can be fragile. A river that rarely changes its course can be disrupted by even modest shifts in rainfall, simply because it has not needed to adjust its behavior. In training, a similar lack of variation may lead to a reduced capacity to respond when demands do change, and life tends to throw us curveballs once in a while. It’s good if we can handle those without injury.
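One simple way to keep an eye on session-to-session variation is a rolling coefficient of variation (standard deviation divided by mean). This is a sketch of one possible approach, not a method taken from the article: the function name `rolling_cv`, the window of five sessions, and the data are assumptions.

```python
from statistics import mean, stdev


def rolling_cv(values, window=5):
    """Coefficient of variation (stdev / mean) over the last `window` sessions.

    Returns None until enough sessions exist. Assumes window >= 2 and
    strictly positive values such as bar speed or force.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(stdev(chunk) / mean(chunk))
    return out


# Hypothetical bar speeds (m/s) at a fixed load, one value per session.
speeds = [0.50, 0.52, 0.48, 0.51, 0.60, 0.50, 0.49, 0.51, 0.50, 0.50]
variation = rolling_cv(speeds, window=5)
```

What counts as "too little" variation is a judgment call for the coach; the number only makes a narrowing range easier to notice.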
Methods such as CUSUM analysis and piecewise modeling make these phase shifts appear clearly, but while they appear more exact they still rely on parameters (thresholds, sensitivity values) that must be set by the coach’s judgment. And is it not so that an experienced coach looking at a well-constructed EMA often will be able to see the same shifts in trend without needing the somewhat heavier statistical machinery needed for even a CUSUM analysis?
That said, there is still value in learning and occasionally applying the more advanced methods. Working through CUSUM or piecewise analyses can train the eye, helping the coach to develop a better sense of what different kinds of change look like in data, and how sensitive different models are to variation. This experience can sharpen interpretation, even when returning to the simpler tools of everyday use to decide on how to structure and set up complementary training in the gym.

Over time, improvement in the things we choose to measure in the gym will slow. We cannot change how we test too much. If we do, we lose the long perspective and our built-up experience of how an athlete performs exactly that movement while the season and life itself change around them, and with every new exercise there will be a period of learning before the athlete can perform the test reliably. Plateaus in test scores, even occasional setbacks, are to be expected, and are not a failure of the system but just how adaptation tends to unfold.
Such plateaus in the gym might actually not be a problem. What we do there is usually not the end goal. It’s almost always complementary: its purpose is to support performance on the field of play or, for the ordinary trainee, to support competence and resilience in daily life. If progress in the gym slows while sport-specific performance keeps improving or, for the established athlete, remains at a satisfactory level, there is no reason for alarm. The full picture may be reorganizing in ways that are not visible through the simple tests we track in the gym.
How often do I hear that motivation to keep doing strength work drops because “my squat or deadlift does not improve anymore”, and that the person will therefore “start doing other things, maybe run more”? But if one stops training strength, those numbers that did not increase will instead tend to decline, and with them other things that were important to this person might become harder, or even impossible.
So stagnation of improvements in a complementary lift does not necessarily signal stagnation in the athlete, and may simply mean that the complementary tool has reached a point of inability to reveal development.

This is why measuring both domains and applying the same analytical methods and models to them can be useful. With this we can interpret stagnation in one context differently depending on what happens in the other, exactly because they display different things: one tries to capture the whole, the other isolates only a part. This perspective can allow us to remain calm, and reduce the temptation to chase personal bests in the gym at any price. It helps us to ask the more relevant question: is the athlete becoming better at what actually matters?
The task of the coach then becomes simple, but also difficult: not merely to follow the calendar, but to notice when the athlete’s response has changed. In doing this we should measure, test, and gather data, but also try not to overcomplicate the issue with methods that tempt us with a false sense of certainty.
Because, in the end, the model does not remove the need for interpretation, and in most coaching environments, a clearly ordered dataset, a sensible smoothing method, and a practiced eye will take you most of the way.
Excel file and data used in the article:
- Excel example used in the article for complementary training: https://martinaltemark.fortime.se/wp-content/uploads/training_baseline_templates_dashboard_phase_adaptive_dynamic_CURRENT.xlsx
- Excel example used in the article for cycling performance: https://martinaltemark.fortime.se/wp-content/uploads/wattbike_piecewise_tailmean_CURRENT.xlsx
