Journal of Athletic Enhancement ISSN: 2324-9080


Research Article, J Athl Enhanc Vol: 10 Issue: 4

Spatial-Temporal Metrics to Assess Collective Behavior in Football: A Systematic Review and Assessment of Research Quality and Applicability

Martin Corsie1, Thomas Craig1, Paul Alan Swinton1* and Neil Buchanan2

1School of Health Sciences, Robert Gordon University, Garthdee Road, Aberdeen, United Kingdom

2Institute of Sport, PE & Health Sciences, University of Edinburgh, Edinburgh, United Kingdom

*Corresponding Author: Paul Alan Swinton
School of Health Sciences,
Robert Gordon University,
Garthdee Road,
Aberdeen, United Kingdom
Tel: +44-1224-2633- 61;

Received date: May 05, 2021 Accepted date: May 20, 2021 Published date: May 31, 2021

Citation: Swinton P, Corsie M et al., (2021) Spatial-Temporal Metrics to Assess Collective Behavior in Football: A Systematic Review and Assessment of Research Quality and Applicability. J Athl Enhanc 10:8.


Extensive research has been conducted to investigate collective behavior of football players using spatial-temporal data. The purpose of this systematic review was to synthesize and evaluate the applicability of this research by reviewing information presented in previous studies and its capacity to clearly describe the analysis approaches and practical applications of findings. A total of 85 studies were included in the review, with approaches assigned to 4 categories of metrics (1: Spaces; 2: Distances; 3: Position; 4: Numerical relations) and 2 analysis methods (1: Predictability; 2: Synchronization). The review identified that authors' descriptions of metrics generally focused on operationalized definitions and provided limited translation to game scenarios or coaching strategies. Similarly, a substantive percentage of studies (22%) did not provide any practical applications, and where these were provided, they were generally broad and offered limited actionable information that could be used directly by practitioners to inform training. Where specific applications were provided, these were consistent with a dynamic systems perspective of collective behavior and focused on organismic, environmental and task constraints that could be manipulated. The findings of the present review highlight the innovative practices of the research base and identify several areas for development to increase understanding and uptake in practice.

Keywords: Dynamic systems; Predictability; Synchronization; Constraints based




Performance analysis is an evolving discipline of sport science that aims to use innovative approaches to inform coach decision making and athlete performance [1,2]. Technologies such as Global Positioning Systems (GPS) and semi-automatic video tracking have been used extensively in elite sport for a prolonged period and enable insights into performance to be captured [3,4]. It has also been reported that GPS data have been used primarily to quantify physical outcomes and, to a lesser extent, to identify external factors that influence the physical outcomes measured [4]. However, team sports are highly complex, and it is recognized that descriptions of simple behaviors such as number of sprints and total distance achieved provide limited insight into the functioning of a team across collective units [5]. Contemporary perspectives that view team sports as complex systems identify a need to focus on interactions between players in different match and training settings. Interactions can routinely be described by the positioning and motion of players relative to each other. More abstractly, Duarte et al. [6] described sports teams as superorganisms composed of teammates continually communicating to help the team function as a unit. Importantly, communication is not limited to routine verbal instruction, but also includes the interrelated dynamics of player motion. Based on these perspectives, player tracking technologies can be used within a systems framework and the generated spatial-temporal data used to provide insight into collective behaviors that may lead to better decision making to improve performance [7].

One sport where spatial-temporal assessment of player collective behavior is developing rapidly is football [2]. Conventional performance analysis approaches such as frequency analysis can be considered simple methods that describe outcomes of collective behavior. However, due to the lack of contextual information describing the underlying processes that led to these outcomes, conventional approaches are limited in their ability to inform decision making [8]. As a result, integration of spatial-temporal data into collective behavior approaches is increasingly being developed to explore models that best describe underlying processes and subsequent outcomes generated. However, with up to twenty-two players plus substitutes participating, a wide range of approaches to quantify and assess coordinated behavior in football exists. This range reflects the diverse research produced by authors over the last decade [9-12], with Sarmento et al. [2] identifying collective behavior analysis and associated metrics as one of the most innovative and important trends for football analysts. The large range of approaches to quantify collective behavior also appears to be influenced by the overarching theoretical framework adopted (e.g. dynamic systems theory, sociobiology) and the specific backgrounds of researchers involved. However, it has been argued that any approach should ultimately be valuable to practitioners and coaches providing relevant information that can be used to improve performance through adaptations to training design or match play [13,2].

To date there has been limited attempt to synthesize research investigating spatial-temporal metrics and assessment of collective behavior in football. The first systematic review was conducted recently by Low et al. [7] and provided a comprehensive overview of the empirical research. A total of 77 studies featuring a mix of observational studies (n=34) and field-based experiments (n=43) with mostly male professionals were included. Low et al. [7] identified 27 unique spatial-temporal metrics that were separated into four categories (1: Position; 2: Distance; 3: Spaces; 4: Numerical relations). Additionally, the authors delineated between linear analysis methods performed on these metrics (e.g. mean, standard deviation and coefficient of variation) and non-linear analysis methods quantifying either predictability (e.g. approximate entropy, sample entropy and dynamic overlap) or synchronization (e.g. relative phase, cross correlation, cluster phase and vector coding). Finally, the authors reported that investigations of collective behaviors were analyzed at all system levels including the dyadic, sub-group, team, and match levels. Collectively, the review produced by Low et al. [7] provided a clear and effective framework to synthesize findings from what initially could appear to be disparate approaches of individual studies. However, as the primary focus of Low et al. [7] was to develop a framework to describe previous analytical approaches, there was limited synthesis and evaluation of the applicability of the evidence base. Given the complexity of spatial-temporal metrics and non-linear analysis approaches in comparison to conventional performance analysis methods, there is a need for authors to effectively describe metrics and analysis approaches, providing context and recommendations so that practitioners and coaches can identify the value of the information and make appropriate decisions. In addition, effective discussion of validity and reliability of different metrics and analysis approaches would further enhance the practical value of the research. Therefore, the purpose of this current systematic review was to synthesize and evaluate the applicability of research investigating spatial-temporal metrics to analyze collective behaviors in football. The review identified and evaluated authors’ descriptions of metrics and the clarity provided to facilitate uptake by practitioners. A similar process was applied to authors’ discussions of practical applications of study findings. Finally, evaluations of research quality and attempts to address validity and reliability of approaches in primary studies were also included.


A systematic review of published studies investigating spatial-temporal metrics for collective behavior in football was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). An initial scoping review of the research base was conducted to familiarize authors with the key metrics used and generate appropriate keyword searches. Subsequently, five electronic databases (SPORTDiscus, Embase, MEDLINE, Web of Science, and Scopus) were searched for publications from the 1st of January 2008 to the 20th of February 2019. The search strategy combined two levels. The first included the following terms combined with the Boolean operator OR: ‘Centroid’, ‘Centre of gravity’, ‘Stretch index’, ‘Team spread’, ‘Surface area’, ‘Dominant region’, ‘Approximate entropy’, ‘Relative Phase’, ‘Dyad’, ‘Voronoi’, ‘Coordination’, ‘Patterns of play’, ‘Performance analysis’, ‘Tactical analysis’, ‘Notational analysis’, ‘Group behavior’, ‘Collective behavior’, ‘collection behavior’. Results from the first level were then combined using the AND operator with the second level comprising: ‘football’ OR ‘soccer’.
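The two-level search strategy can be sketched as a single Boolean string. The snippet below is illustrative only: the function names are hypothetical, and the quoting and field syntax actually required differs between databases, so the generated string would need adapting to each one.

```python
# Level 1: metric and analysis terms listed in the search strategy (joined with OR).
LEVEL_ONE = [
    "Centroid", "Centre of gravity", "Stretch index", "Team spread",
    "Surface area", "Dominant region", "Approximate entropy", "Relative Phase",
    "Dyad", "Voronoi", "Coordination", "Patterns of play",
    "Performance analysis", "Tactical analysis", "Notational analysis",
    "Group behavior", "Collective behavior", "collection behavior",
]
# Level 2: sport terms (joined with OR), combined with level 1 using AND.
LEVEL_TWO = ["football", "soccer"]


def or_group(terms):
    """Quote each term and join the group with the Boolean OR operator."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(level_one, level_two):
    """Combine the two OR groups with the Boolean AND operator."""
    return f"{or_group(level_one)} AND {or_group(level_two)}"


query = build_query(LEVEL_ONE, LEVEL_TWO)
print(query)
```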

Inclusion criteria for retrieved studies included: 1. Participants of any age or sex engaging in football competition or training; 2. Reporting of spatial-temporal metrics comprising at least two position references that described collective behavior; 3. The full publication was available in English. Investigations published as conference abstracts were excluded. Two separate reviewers (MC and NB) screened article titles then abstracts. Full-texts were then read and inclusion criteria applied to complete the three-stage screening process. Disagreements regarding inclusion were resolved with discussion at the end of the abstract and full-text stages. The primary purpose of the review was to synthesize and evaluate the included research by considering authors’ descriptions of metrics and their discussion of reliability, validity, and practical application of findings. Therefore, data extraction was completed using two different extraction forms. The first extraction included basic information such as population investigated, sample size, specific metrics applied, analysis approach and overarching findings. The individual metrics and analysis approaches that were identified were categorized according to the criteria identified by Low et al. [7]. The second extraction included information regarding authors’ descriptions of included metrics and comments regarding their validity, reliability and practical applications. Direct quotes were extracted from each study and where multiple appropriate quotes were present, all were documented. Quotes regarding practical applications were categorized as either: 1. Broad: generic conclusions providing limited direct applicability; 2. Moderate: conclusions linked to specific game or training aspects providing some direct applicability; 3. Specific: clear recommendations with specific reference to the use of a metric or analysis method providing direct applicability. Extractions were conducted independently by two reviewers (MC and TC), with a final discussion, including the categorization of the application of metrics, amongst the full research team to ensure consistency.

To evaluate the methodological quality of studies a risk-of-bias quality form was adopted based on a 12-item checklist adjusted from Sarmento et al. [14]. Studies were assessed based on the following criteria: 1. Study purpose; 2. Background literature; 3. Study design; 4. Detail of sample used; 5. Justification of sample size; 6. Identification of ethical approval; 7. Detail of methods; 8. Application of appropriate inferential statistics; 9. Application of relevant analysis methods; 10. Generation of appropriate conclusions; 11. Generation of appropriate practical applications; 12. Acknowledgement of study limitations. Each item was scored on a binary scale, and the percentage of items scored positively provided an overall quality rating. Three bins were created to group articles into low (≤ 50%), moderate (>50%, ≤ 75%), and high (>75%) quality [15]. Data extraction and risk of bias assessment were made in duplicate across three reviewers (MC, NB and TC) with a final discussion amongst the full research team.
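As a minimal sketch of the scoring rule described above, the following function converts 12 binary item scores into a percentage and assigns the corresponding quality bin. The function name and the example study are hypothetical and are not taken from the review's extraction forms.

```python
def quality_rating(item_scores):
    """Return (percentage, band) for 12 binary item scores (1 = criterion met)."""
    if len(item_scores) != 12 or any(s not in (0, 1) for s in item_scores):
        raise ValueError("expected 12 binary item scores")
    pct = 100.0 * sum(item_scores) / len(item_scores)
    # Bins as stated in the text: low (<= 50%), moderate (> 50%, <= 75%), high (> 75%).
    if pct <= 50:
        band = "low"
    elif pct <= 75:
        band = "moderate"
    else:
        band = "high"
    return pct, band


# Hypothetical study meeting 9 of the 12 criteria.
example = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
print(quality_rating(example))
```

Note that a study meeting exactly 9 of 12 items (75%) falls in the moderate bin, because the upper bound of that bin is inclusive.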


The initial literature search identified 2282 studies, which were reduced to 1110 after removal of duplicates. Title and abstract screening reduced the number of studies obtained in full-text to 97. A further 12 studies were removed due to metrics not meeting the inclusion criteria specified (7), non-reporting of data (3), and inclusion of sports other than football (2). The 85 included studies comprised a wide range of population groups (Table 1) with respect to game type (competition, friendly, training game, small-sided games (SSGs) and 1v1 bouts), playing level (professional, youth, semi-professional, amateur, composite), and country of investigation (Australia, Austria, Brazil, England, Finland, Germany, Italy, Multi-national, Netherlands, Portugal, Spain, Switzerland). The mean number of matches analyzed was 10.6 with a standard deviation of 18.1 (range: 1-103). Thirty-one studies investigated metrics across full 90 min matches, whereas most studies investigated much shorter SSGs. Findings from the studies were varied (Supplemental) and generally focused on the relation of metrics to success in terms of offense or defense, or the effect of factors such as gender, age, formations, tactics, number of players or pitch dimension on metric values.

Classification      Population type                 Frequency
Game type           Official competition            30
                    Friendly                        3
                    Training game                   9
                    Small-sided conditioned game    37
                    1v1 bouts                       6
Playing level       Professional                    33
                    Youth                           38
                    Semi-professional               2
                    Amateur                         9
                    Composite                       3
Country observed    Australia                       1
                    Austria                         1
                    Brazil                          8
                    England                         7
                    Finland                         1
                    Germany                         2
                    Italy                           1
                    Multi-national                  4
                    Netherlands                     4
                    Portugal                        18
                    Spain                           8
                    Switzerland                     1
                    Unspecified                     29

Table 1: Summary of study characteristics.

The research quality evaluation (Table 2) identified a single “low quality” study (1.2%), 30 “moderate quality” studies (35.3%) and 54 “high quality” studies (63.5%). The studies were most susceptible to bias through a lack of sample justification, with only 5 studies (5.9%) stating a reason for the population selected. The research quality evaluation also highlighted that most studies (62.7%) failed to acknowledge study limitations and many (21.9%) failed to identify practical applications.

Quality item                                                 Success rate
A clearly stated study purpose                               100%
Relevant background literature used                          100%
Appropriate design for the research question                 98.8%
Detailed reporting of the sample size                        96.5%
Justification of the sample size                             5.9%
Informed consent or ethical permission                       87.1%
Detailed reporting of methodology                            97.6%
Used inferential statistics related to the aim               89.4%
Appropriate analysis methods considering the study aim       94.1%
Appropriate conclusions stated relating to study methods     96.5%
Stated practical applications derived from study results     63.5%
Acknowledged and described study limitations                 37.7%

Table 2: Research quality evaluation.

Where studies did identify practical applications, these were most often categorized as being broad and providing limited clear applicability (54.0%; Table 3). Examples of practical applications categorized as being of moderate applicability (36.8%) generally focused on constraints that could be applied in training, including manipulation of pitch size [15], formations [16] and SSGs [17,18]. A limited number of practical applications (9.2%) were categorized as specific and provided clear recommendations with target values for team [19], field space [20,21] and distance between players [22] that could be directly applied by practitioners and coaches.

Application type, followed by author (year) and representative quote

Broad (generic conclusions providing limited direct applicability)
  Chung (2019): “This evidence is an important aspect for coaches to consider when planning SSCG tasks, since manipulation of [individual playing area] through different constraints manipulation (i.e., pitch dimension or number of players) promote different contextual information as well as new affordances.”
  Clemente (2014c): “Players should be repositioned to ensure the in-phase relationship and adjust the distance between the centroids.”
  Coutinho (2019a): “Using different pitch configurations might help players to improve their ability to identify the most relevant cues that support the emergence of functional behaviors.”
  Frencken (2013): “Coaches must carefully choose the type of small-sided game in training, as interaction patterns vary depending on pitch dimensions.”
  Goncalves (2018b): “Coaches should prepare physical and mental fatiguing practice tasks to increase players ability to adapt and perform under these scenarios.”
  Moura (2012): “Automatic tracking methods during training sessions allow coaches to calculate the same variables proposed in the present research, and based on this information, they can precisely control their players’ organisation on the pitch and systematise tactical strategies.”

Moderate (conclusions linked to specific game or training aspects providing some direct applicability)
  Castellano (2013): “The surface area may help to explain the defending flow”
  Clemente (2015): “Can be useful information to coaches in order to control the superiority or inferiority zones, reorganizing a team’s strategies according to its weaknesses or strengths”
  Clemente (2013b): “The speed and angular positioning of the attacker are key factors when trying to unbalance the attacker-defender dyad.”
  Folgado (2014a): “Selecting stronger opponents for matches during the pre-season seems to promote more synchronized behaviors between players.”
  Siegle (2013): “Perturbations can be used to identify playing situations in which one team attacked in a way, which the defending team was not able to answer.”
  Vilar (2013): “This method captures how teams explored different regions to maintain backward stability and create forward instability, in accordance with the shape and location of the area of play.”

Specific (clear recommendations with specific reference to the use of a metric or analysis method providing direct applicability)
  Aguiar (2015): “For example, in a 3-a side SSG, these distances [player to team centroid] should be around 5 to 6 m and, therefore, require the optimisation from the focus on environmental cues, passing performances and explosive strength and power within these limits.”
  De Souza (2018): “As for practical recommendations of our paper, coaches may create, for instance, 6 × 6 SSGs with the spaces presented in the study: about 23 m in length and 44 m in width, with the objective of motivating players to invade the space.”
  Headrick (2012): “player-to-ball relationships can be used to design practice tasks by positioning the players and ball within critical distances of each other. For example, a practice game could be designed with a D-Ball distance of 2m, representing the range at which the stable state of D-Ball distance appeared in this study.”

Table 3: Summary of practical applications identified in included studies.

Across the 85 studies, 115 unique metrics and analysis approaches were identified across a total of 366 reported instances. A total of 84 (23%) instances were identified where an equation was presented. In contrast, there were 99 (36%) instances where a metric or analysis procedure was reported with no equation or source provided to describe calculations. Similarly, there were 79 (21%) instances of metrics reported with no formal description or comment to provide context or understanding of the purpose of the metric. According to the framework presented by Low et al. [7], the most commonly reported metric category was space metrics (129 instances), followed by distance metrics (110 instances), position metrics (21 instances) and technical-tactical metrics (17 instances). Non-linear analysis methods quantifying synchronization (52 instances) were most frequently applied using the Hilbert transform, typically at the team level through assessment of team centroids to identify coordinated movement across teams [23,24]. Predictability analysis methods (37 instances) were most frequently applied using Approximate Entropy (ApEn), followed by Sample Entropy (SampEn) and Shannon entropy.
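To illustrate what a Hilbert-transform synchronization analysis involves, the sketch below computes the relative phase between two time series, such as the longitudinal centroid positions of two teams. It is a generic textbook construction and not the implementation used in any included study: the function names and the synthetic signals are assumptions, and the plain discrete Fourier transform used here is only practical for short series (in practice a library routine such as `scipy.signal.hilbert` would normally be used).

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real series via the discrete Fourier transform.

    Positive-frequency terms are doubled and negative-frequency terms zeroed,
    the usual discrete Hilbert-transform construction. O(n^2), for illustration.
    """
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]  # remove the offset so phase is well defined
    spectrum = [
        sum(centred[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    half = (n + 1) // 2
    for k in range(1, n):
        if n % 2 == 0 and k == n // 2:
            continue  # Nyquist term is left unchanged
        spectrum[k] *= 2 if k < half else 0
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def relative_phase(a, b):
    """Relative phase in degrees: near 0 = in-phase, near +/-180 = anti-phase."""
    za, zb = analytic_signal(a), analytic_signal(b)
    return [math.degrees(cmath.phase(p * q.conjugate())) for p, q in zip(za, zb)]


# Synthetic centroid positions (metres): team B lags team A by a quarter cycle.
n = 100
centroid_a = [50 + 10 * math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
centroid_b = [45 + 10 * math.sin(2 * math.pi * 4 * t / n - math.pi / 2) for t in range(n)]
rel = relative_phase(centroid_a, centroid_b)
print(round(sum(rel) / len(rel), 3))
```

Values close to 0° indicate that the two series move together (in-phase), whereas values close to ±180° indicate movement in opposite directions (anti-phase). Studies typically summarize the proportion of time spent within a band around each of these states.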

No explicit reference was made to reliability or validity of metrics in any of the included studies. Implicitly, authors assessed the validity of metrics through multiple approaches. The most common was to employ rank-order methods and compare metrics across age groups or playing levels with the implicit assumption that older players and those playing at a higher level or in stronger teams would demonstrate more effective collective behaviors [25-27]. Palucci et al. [11] identified positive relationships between metrics and age with greater width, surface area, team spread, team centroid distance and attack-defense synchronization with older players. Similarly, Silva et al. [28] identified that players of the same age group but from a higher standard of competition worked together more effectively to explore greater amounts of available space. Additionally, Duarte et al. [29] identified that competition against stronger teams resulted in increased time in synchronized behavior for overall displacements and displacements at higher intensities.


The present study comprised a systematic review of research investigating collective behavior in football through analysis of spatial-temporal data. The review identified that the area is rapidly growing and features a wide range of metrics (Spaces, Distances, Positions and Numerical relations), analysis methods (Synchronization and Predictability), populations (primarily elite level males from U11 to adult) and game scenarios (e.g. competitive matches, SSGs, and 1v1 drills). Focus of the review was placed on authors’ descriptions of the metrics generated and their discussion of reliability, validity, and practical applications of findings. The present review identified several areas for further development of the evidence base to increase the quality of the information and uptake in practice. An initial barrier was identified with regards to authors’ descriptions of metrics with overemphasis placed on operationalized definitions and limited translation to game scenarios or coaching strategies. Additionally, when discussing practical applications, it was identified that authors frequently provided broad statements that restated results and did not provide clear recommendations with guidelines on relevant values or processes that could be used to generate team specific values. The following sections discuss in greater detail the data extracted from the review. 

For practitioners to assess and apply a metric and analysis approach to their own data, an understanding of what the approach measures and how it relates to performance is required. However, review of the included studies identified that most authors’ descriptions lacked conceptual overview, and instead focused exclusively on operationalized definitions. Common examples across the metric categories included: Spaces (surface area): “The convex hull formed by positions of the players in each team” [11]; Distance (distance between centroids): “The difference, longitudinally and laterally, between teams centroid positions” [30]; Position (relative angle): “The relative angle (α) between the center of goal, defender and attacker” [31]; and Numerical relations (Space Control Gain): “Measured by the difference of space control percentage between pass initiation and pass completion modeled by utilizing Voronoi diagrams of the pitch at each time frame” [32]. Similarly, descriptions regarding the two main analysis methods provided limited conceptual understanding, with representative examples including: Synchronization (relative phase): “The relative phase of the time series corresponding to speed displacements of all dyads” [15]; and Predictability (ApEn): “Measure was used to assess the complexity of the particular collective behaviors” [33]. In contrast, there were a limited number of examples where metric descriptions also provided conceptual detail to relate to aspects of football: “The stretch index measures the compactness of a team on a given moment” [34]; “(Effective playing space) was calculated as the surface area of the convex hull of all players (Excluding goalkeepers) as a measure of the playing area used by the players in a given situation” [32]; “(surface area) This variable expresses the relationship between the tactical forms (shapes) adopted and spaces exploited by both teams, to support analysis of how they varied over time” [25]. In addition, primarily for numerical relations metrics, there were examples where authors attempted to add context with regard to tactics and philosophy: “(Offensive space ratio) The aim of this principle is to reduce the concentration of opponents in their central zone, thus attempting to open up some spaces to penetrate” [35]; “(Pressure passing efficacy) aims to measure high quality through-balls by weighing passes with more than one outplayed opponent by the pressure on both pass initiator and receiver” [32]. Whilst operational definitions and associated equations are important to ensure consistency across analyses, authors should seek to combine this information with greater context as demonstrated by these latter examples.
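Several of the operational definitions quoted above can be made concrete with a short sketch computed from a single frame of player coordinates. The helper names and example positions below are hypothetical, and the stretch index is taken here as the mean player-to-centroid distance, which is one common definition and may differ from that used in individual studies.

```python
import math


def centroid(points):
    """Mean (x, y) position of a set of players."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def stretch_index(points):
    """Mean distance from each player to the team centroid (assumed definition)."""
    cx, cy = centroid(points)
    return sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def surface_area(points):
    """Area of the convex hull of player positions (shoelace formula)."""
    hull = convex_hull(points)
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(hull, hull[1:] + hull[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical outfield positions in metres (x = longitudinal, y = lateral).
team_a = [(10, 10), (10, 30), (30, 10), (30, 30), (20, 20)]
team_b = [(40, 10), (40, 30), (60, 10), (60, 30), (50, 20)]
ca, cb = centroid(team_a), centroid(team_b)
centroid_distance = (cb[0] - ca[0], cb[1] - ca[1])  # longitudinal, lateral
print(surface_area(team_a), stretch_index(team_a), centroid_distance)
```

In football terms, a larger surface area or stretch index at a given moment indicates a more spread-out team, while smaller values indicate a more compact shape. Passing only outfield players to `surface_area` corresponds to the "excluding goalkeepers" convention quoted above.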

The review of included studies identified no explicit reference to reliability or validity of the metrics and analysis approaches generated. Several authors referred to the reliability of instrumentation used to collect position measures [11,31,17]; however, no study investigated the extent to which noise in data influenced reliability or consistency of values generated within, for example, single sessions (where tactics may be expected to be somewhat consistent). Metrics and analysis approaches that are unduly influenced by noise or vary substantially across time periods where collective behaviors are expected to be consistent should not be recommended. Across the included studies a wide range of systems measuring position coordinates were identified, with variation in accuracy expected. As a result, future research should seek to identify the positional accuracy required to generate reliable data across different metrics and analysis approaches to inform practice and measurement systems used.

Whilst no explicit references were made in the included studies to validity, implicit attempts to assess validity were made using rank order methods to compare, for example, metrics across age groups [25,36,26,27] with the reasonable assumption that older players would demonstrate more effective collective behaviors. For example, Barnabe et al. [25] identified significant differences in surface area among U16, U17, and U19 teams. Similarly, Olthof et al. [26] found significant differences in the lateral stretch index of U17 and U19 teams. However, for collective behavior approaches to be more widely used by practitioners there is a need for future research to establish the sensitivity of approaches to distinguish between strong and weak teams within leagues or distinguish between good and bad performances within a single team.

In addition to rank order methods, included studies also implicitly investigated validity by assessing whether approaches to assessing collective behavior demonstrated longitudinal patterns that would be expected with regard to fatigue or increased experience. Goncalves et al. [15] observed the variation in teammate dyad synchronization over 51 matches. Synchronization of players increased when walking and decreased when jogging and running as the match progressed. Moreover, coefficient of variation values identified greater variation in jogging and running synchronization as matches progressed, which the authors attributed to mental fatigue. In the context of increased experience, Folgado et al. [37] investigated synchronous behaviors of a professional football team from the beginning of pre-season to the end of pre-season during 9v9 matches. Analyses identified an increase in synchronization between the first and last training sessions, consistent with the hypothesis that familiarity and greater synchronized behaviors can emerge as team-mates obtain greater experience playing with each other. These and similar results highlight that spatial-temporal metrics may be sensitive to a range of external factors relevant to football. However, the results also highlight potential variability in metrics within and between games, which may have to be accounted for by practitioners when profiling data obtained.

A limited number of the studies included in this review also attempted to assess whether spatial-temporal metrics could predict critical events. Moura et al. [38] applied vector coding to team spread and identified differences between inter-team coordination preceding shots on goal and defensive tackles. The authors reported that attacking plays that ended in shots on goal presented greater anti-phase patterns in the early stage of the possession and that teams should attempt to present contrary behaviors to their opponent as soon as ball possession is regained. In a more focused aspect of game play, Shafizadeh et al. [39] analyzed 1v1 situations between strikers and goalkeepers in the English Premier League. Results demonstrated that interpersonal distance and relative velocity between attacker and goalkeeper in the longitudinal direction influenced the probability of a goal being scored or not. Whilst these findings and others may provide initial information regarding collective behaviors preceding important events, a challenge for researchers and practitioners is to determine what football related strategies should be employed when limitations are identified. Given the complexity of sports such as football where multiple confounding factors limit clear explanations [40], development in this area may require greater integration between metrics and football strategies or identification of key constraints that can be manipulated to alter behaviors.

Based on recognition that collective behavior analyses must ultimately provide practitioners with information to improve performance through adaptations to training design or match play [13,2], clear discussion of practical applications of findings from research is required. Review of the included studies identified that a substantive portion (21.9%) did not make any direct reference to practical applications. Additionally, it was identified that when practical applications were made by authors these were most often broad, reiterating results of the study in slightly different contexts that demonstrated limited actionable qualities. Representative examples of broad practical applications (Table 3) included: “Varying individual playing area by manipulating number of players or by manipulating pitch dimension possesses different implications on emergent teams’ behavioral patterns. Therefore, this evidence is an important aspect for coaches to consider” [41]. “The results of this study can provide valuable tools for controlling player organization on the pitch” [42]. “The manipulation of informational constraints to shape tactical behavior may be an asset” [43]. “The phase couplings and other spatial-temporal relations among players and teams reflects their tactical performances and should consequently be considered by coaches in the design and implementation of small-sided games.” [44]. Collectively, broad practical applications pointed practitioners and coaches towards procedures that may have potential to improve collective behaviors, but provided limited information on how to enact changes in training or competition. Additionally, when combined with metrics and analysis procedures that are not well contextualized with regard to football specific actions or well understood tactics, broad practical applications made by researchers are unlikely to be implemented.

The next most common categorization of practical applications was moderate, where authors linked conclusions to specific aspects of competitive games or training drills. Consistent with the dynamic systems approach, many practical applications categorized as moderate focused on different types of constraints as a means of altering collective behaviors. Examples of manipulating organismic constraints were proposed by Folgado et al. [37,45], who recommended altering the team make-up of stronger and weaker opponents in training, with drills featuring stronger opponents tending to increase synchronization. With regard to environmental constraints, Coutinho et al. [12] identified that adding spatial references in the form of additional pitch lines altered collective behaviors. Inclusion of reference lines was found to increase team defensive behavior, accompanied by lower approximate entropy values, and was recommended for situations where more structured patterns of play were desired [12]. In contrast, removal of reference lines was found to increase players’ movement synchronization, which was recommended for developing offensive movement patterns through less structured playing styles that create instabilities [46]. Finally, multiple authors identified task constraints, focusing on small-sided games (SSGs), that were categorized as moderate practical applications. Manipulations of formations [16] and overload situations [17,18] were identified as strategies to enhance both attacking and defensive behaviors. Whilst practical applications identified as moderate provide coaches and practitioners with clearer guidance on potential constraints to manipulate and analysis procedures to adopt, the examples did not provide guidance on the values to expect or on what may represent a substantive change. One of the most consistent conclusions across the research base was that greater synchronization in spatial-temporal metrics tended to reflect more effective behaviors [18,36,47-50]. However, recommendations and guidance identifying expected changes in values for different metrics may be required to facilitate greater uptake in practice.

In a small number of instances, authors provided specific practical applications that included recommended metric values linked to relevant training situations and goals. Aguiar et al. [19] recommended that players should be approximately 5-6 m from their team centroid during 3-a-side SSGs to enhance availability of environmental cues and the ability to pass effectively. Similarly, Gonçalves et al. [15] recommended that players adopt an effective playing space of approximately 12 m² for actions involving three team mates to enhance availability of environmental cues and control players’ positioning while defending. Additionally, multiple authors provided specific recommendations regarding pitch dimensions to enhance collective behaviors described by distance and positional measures in youth (Castellano et al. [11]) and female (Zubillaga) players. In contrast to researchers providing specific recommendations regarding metric values, it has been suggested that practitioners and coaches set their own values based on their philosophy and tactical approach to matches [51]. Additionally, it is recognized that, given the extent to which the spatial-temporal research base is developing with novel metrics and analysis approaches, many studies are exploratory and clear practical applications should not be expected. However, transfer of the approaches from the research domain to practical use will require more specific recommendations linked to football-specific concepts, or clearer guidance on procedures that coaches and practitioners can use to develop their own values and monitoring processes.
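As an illustration of how a distance-to-centroid recommendation such as that of Aguiar et al. [19] might be checked against positional data, the following minimal sketch computes each player's distance from the team centroid for a single time frame. The player coordinates, the function names and the decision to treat the centroid as the plain mean of the supplied positions are assumptions made for illustration only; the included studies may define the centroid differently (for example, excluding goalkeepers).

```python
import math

def centroid(positions):
    """Mean (x, y) of a list of player positions, in metres."""
    n = len(positions)
    cx = sum(x for x, _ in positions) / n
    cy = sum(y for _, y in positions) / n
    return (cx, cy)

def distances_to_centroid(positions):
    """Euclidean distance of each player from the team centroid."""
    cx, cy = centroid(positions)
    return [math.hypot(x - cx, y - cy) for x, y in positions]

def within_band(distances, low=5.0, high=6.0):
    """Flag players inside a target band; 5-6 m follows the value cited above."""
    return [low <= d <= high for d in distances]

# Hypothetical positions (metres) of three outfield players at one instant.
team = [(0.0, 0.0), (9.0, 0.0), (0.0, 9.0)]
dists = distances_to_centroid(team)
flags = within_band(dists)
```

In practice the same calculation would be repeated for every frame of tracking data, so that a coach could monitor the proportion of time each player spends inside a chosen band rather than a single snapshot.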

Based on the risk of bias assessment, one of the key weaknesses in the included studies was the lack of sample size justification, with only 5.9% of studies providing one. The number of matches investigated in the included studies ranged from 1 to 103. Most studies investigated fewer than 10 matches, with similar metrics used for analysis of both 11v11 matches and SSGs. These findings suggest that most studies used convenience samples rather than identifying likely effect magnitudes and calculating the sample sizes required to obtain appropriate statistical power. Alternatively, researchers could consider the data an individual team might have available and justify the sample by grounding the research in a practical context. Broader consideration of the populations investigated identified a substantial skew towards male players, with only two studies incorporating females [52,53]. It is unclear whether collective behavior as assessed by spatial-temporal metrics would differ between males and females. Tenga et al. [52] identified similarities between the playing length and width of male and female players. However, male players demonstrated higher levels of variation, which was suggested to aid in the creation of more space and passing opportunities. Further analysis should be conducted to identify clearer differences between males and females across a range of metrics. If clear differences are identified, further investigation of collective behavior in women’s football will be needed to provide gender-disaggregated data.
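To make the idea of an a priori sample size justification concrete, the sketch below estimates the number of observations per group needed to detect a standardized mean difference between two independent groups. It uses the standard normal-approximation formula n = 2((z_{1-α/2} + z_{power}) / d)², which is an assumption chosen for illustration and not a method reported by the included studies. Effect sizes are hypothetical, and the calculation treats observations (e.g. matches or SSG bouts) as independent, which may not hold for repeated measurements on the same team.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate observations per group for a two-sided, two-group comparison.

    effect_size is a standardized mean difference (Cohen's d); the normal
    approximation slightly underestimates the n given by an exact t-test.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Hypothetical effect sizes: medium (0.5) and large (0.8).
n_medium = n_per_group(0.5)
n_large = n_per_group(0.8)
```

Under these assumptions even a large effect requires considerably more observations per condition than the fewer than 10 matches analyzed in most of the included studies, which illustrates why justification grounded in the data realistically available to a team may be the more practical route.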

A key area for future investigation that may help practitioners adopt assessment of collective behaviors is the link between competitive matches and training. Matches provide information on team dynamics within the performance context; however, official competitions are relatively fixed, whereas during training sessions coaches are free to make large changes to constraints in attempts to alter behaviors and generate effective team dynamics. The information obtained during training sessions could then be used to inform strategies during matches and to determine whether similar changes to team dynamics emerge. To date, studies have identified that subtle differences in collective behavior metrics can be obtained by manipulating constraints such as pitch dimensions, number of players and player formations [11,12,17,19,44,54-57]. However, because ideal values for metrics are not understood, it is unclear whether these adaptations are desirable [57-71]. Moreover, how these manipulations translate from training into matches is a further step for which there is currently no evidence [72-99].


There has been a substantial increase in the number of studies investigating collective behaviors using positional data in football over the last decade. Only 23 studies matching these criteria were identified between 2008 and 2013, whereas 62 studies were identified between 2014 and 2019. Additionally, more recent studies have more frequently included numerical relations (16/17), synchronization (45/52) and predictability (32/37) metrics than studies prior to 2014. Across the included studies, many metrics were analyzed using a range of approaches, producing extensive avenues for future research and for implementation by practitioners. Whilst the research base highlights that collective behavior analysis through spatial-temporal data may provide unique insights into performance in football, there are limitations and gaps in understanding that currently prevent the widespread use of the approaches in practice. Common limitations acting as barriers to implementation include reliance on purely mathematical descriptions of metrics at the expense of clear conceptual descriptions. Similarly, uptake is currently limited by a lack of detailed practical applications, including normative data and clear guidance on how player position and relative movement are best manipulated to simultaneously improve metric values and performance. Greater conceptual clarity of metrics may be obtained by researchers incorporating the views and playing principles of coaches to align or adjust metrics. This process may enhance coach buy-in and, as a result, the likelihood of performance analysts conducting collective behavior analyses as part of their reporting to coaches. In contrast, progressing to the stage where clear practical recommendations can be made is likely to require substantially more research. Important aspects for future research to consider include the assessment of reliability; establishing whether collective behavior metrics are sensitive enough to explain performance differences between teams and stronger performances within a team; and whether manipulations in training can create changes in collective behaviors and their associated metrics that transfer to competition.

