Network analysis of multivariate data in psychological science
functions of network analysis in the context of three types of application in psychological science
schematic workflow of psychometric network analysis
research question
data collection
time-series data
time series from one or multiple persons are used to characterize multivariate dependencies between variables that are assessed intra-individually (T = large, N ≥ 1).
most often applied in situations where one seeks insight into the dynamic structure of systems.
assist in interpreting intensive longitudinal data by offering insightful characterizations of the multivariate pattern of dynamics.
ex: investigate the impact of reduced social contact due to lockdown measures on the mental health of students. Students are followed daily for 2 weeks, with momentary social contact as well as current stress, anxiety and depression assessed 4 times per day
a network model can be fitted to these data to investigate to what degree social contact variables influence mental health variables over the course of hours and days
when several persons are followed, the design is mixed; in such situations, it is often profitable to use a statistical multilevel approach, in which the repeated observations are treated as nested in the individuals. This explicitly separates individual differences from time dynamics
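The separation of individual differences from time dynamics can be illustrated with person-mean centring. This is only a rough sketch of the idea, not a full multilevel model: the function name `split_within_between`, the toy data and the variable labels are assumptions made for illustration.

```python
import numpy as np

def split_within_between(data, person_ids):
    """Separate person means (between) from person-centred scores (within).

    data: array of shape (n_observations, n_variables)
    person_ids: array of shape (n_observations,) naming the person for each row
    """
    data = np.asarray(data, dtype=float)
    person_ids = np.asarray(person_ids)
    persons = np.unique(person_ids)  # sorted unique ids
    # One row of means per person: the "individual differences" part.
    means = np.vstack([data[person_ids == pid].mean(axis=0) for pid in persons])
    # Map every observation to the row of its person in `means`.
    row_to_person = np.searchsorted(persons, person_ids)
    # Deviations from the own mean: the "time dynamics" part.
    within = data - means[row_to_person]
    return persons, means, within

# Hypothetical toy data: 3 students, 4 measurement moments each,
# 2 variables (say, social contact and stress).
rng = np.random.default_rng(0)
person_ids = np.repeat(["s1", "s2", "s3"], 4)
trait_levels = np.repeat(rng.normal(size=(3, 2)), 4, axis=0)
data = trait_levels + 0.5 * rng.normal(size=(12, 2))

persons, between_means, within_scores = split_within_between(data, person_ids)
```

The person means would feed a between-persons analysis and the centred scores a within-person analysis; a proper multilevel model estimates both parts jointly rather than in two steps.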
cross-sectional data
variables measured at a single time point in a large sample (T = 1, N = large)
associations between variables are driven by individual differences, which renders such networks useful for studying the psychometric structure of psychological tests
ex: interested in the empirical relations among personality and personal goals.
The objective of psychometric network analysis, in this case, would be to offer insight into the multivariate pattern of conditional dependencies that characterize the joint distribution of these variables at these different levels of aggregation
resulting topologies represent structures that describe differences between individuals, i.e. only inter-individual differences.
note: inter-individual differences do not necessarily translate to intra-individual processes
if interested solely in the structure of individual differences, cross-sectional data are adequate
research into intra-individual dynamics ideally complements such data sources with panel data or time series.
Box 1. Psychometric structure of personality test scores
Research has shown that people’s self-ratings on adjectives (such as outgoing, punctual and nervous) or responses to items that characterize them show systematic patterns of correlations.
longitudinal (= panel) data
a limited set of repeated measurements characterizes both the association structure of variables at a given time point and the way these conditional dependencies change over time (N >> T).
illuminate the structure of individual differences and intra-individual change in parallel.
ex: use repeated assessments of emotions and beliefs towards Bill Clinton as represented in longitudinal panel data of the American National Election Studies (ANES)
aim to model consistency, stability and extremity of attitudes towards Bill Clinton during the time that he transitioned from governor of Arkansas to president of the United States.
The network theory of attitudes (Box 2) formalizes changes in attitude importance as network temperature, for example, increasing or decreasing interdependence between attitude elements.
In the panel data example, network analyses can assist in modelling temperature changes in the interdependence of attitude elements towards Bill Clinton.
Box 2. Causal attitude network model and attitudinal entropy. The network theory of attitudes holds that attitudes are higher-level properties emerging from lower-level beliefs, feelings and behaviours (attitude <---- beliefs, feelings, behaviours). Different attitude elements = nodes; edges between nodes = potentially bidirectional interactions between the elements.
network analysis
analyze: network structure estimation
node selection
driven by substantive rather than methodological considerations.
edge selection
statistical models
interpret: network description
network stability analysis
The majority of network modelling approaches use conditional association
A conditional association between two variables holds when these variables are probabilistically dependent, conditional on all other variables in the data
measures of conditional association depend on the structure of the data
multivariate normal data -> partial correlations
binary data -> logistic regression coefficients
the strength of this conditional association is the edge weight, which describes the connection between two nodes
If the association between two variables can be explained by other variables in the network, so that their conditional association vanishes when these other variables are controlled for, then the corresponding nodes are disconnected in the network representation.
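For multivariate normal data, the way a conditional association "vanishes" can be sketched with partial correlations obtained from the inverse covariance matrix. The simulated chain A -> B -> C and all names below are assumptions chosen for illustration; the point is only that A and C correlate marginally but are disconnected once B is controlled for.

```python
import numpy as np

def partial_correlations(data):
    """Partial correlation of every pair of columns, given all other columns."""
    cov = np.cov(data, rowvar=False)
    precision = np.linalg.inv(cov)            # inverse covariance matrix
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)        # standardize and flip the sign
    np.fill_diagonal(pcor, 0.0)               # no self-loops in the network
    return pcor

# Hypothetical data: A influences B, B influences C, no direct A-C link.
rng = np.random.default_rng(0)
n = 20000
a = rng.normal(size=n)
b = 0.7 * a + rng.normal(size=n)
c = 0.7 * b + rng.normal(size=n)
data = np.column_stack([a, b, c])

pcor = partial_correlations(data)
marginal_ac = np.corrcoef(a, c)[0, 1]   # clearly positive
conditional_ac = pcor[0, 2]             # close to zero: edge A-C is absent
```

In the network representation, nodes A and C would therefore be unconnected, while the edges A-B and B-C carry positive weights.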
statistical model: pairwise Markov random field (PMRF)
A graphical model that describes the joint probability distribution of a set of variables in terms of pairwise statistical interactions
existing statistical methodologies used to estimate the PMRF: significance testing, cross-validation31, information filtering and regularized estimation
this Primer is limited to network approaches that use the PMRF, although it should be noted that other approaches to the analysis of multivariate data exist, including models based on zero-order associations37, self-reported causal relations between variables38,39 and relative importance of variables40.
the joint likelihood of multivariate data is modelled through the use of pairwise conditional associations, leading to a network representation that is undirected
several benefits to the PMRF that make this particular network representation important.
encodes conditional independence relations (in terms of absent links between nodes), which form an important gateway to identify candidate data-generating mechanisms
However, the PMRF does not require an a priori commitment to any particular data-generating mechanism (unlike directed acyclic graph estimation or latent variable modelling, for example).
PMRFs do not place strong assumptions on the structure of the generating model but do hold clues to causal structure through conditional independencies,
which makes them well suited to exploratory analyses
estimated PMRFs often describe the data successfully with only a subset of the possible parameters (for example, using sparse network structures), which leads to more insightful network visualizations.
a priori commitments invariably lead to problems of underdetermination, because many structurally different models will produce indistinguishable data, which is known as statistical equivalence.
By contrast, the PMRF is uniquely identified, so there are no two equivalent PMRFs with different parameters that fit the data equally well.
continuous data
popular type of PMRF: the Gaussian graphical model (also known as a partial correlation network), in which edges are parameterized as partial correlation coefficients.
binary data
a popular PMRF, the Ising model, can be used, in which edges are parameterized as log-linear relationships
The Ising model and the Gaussian graphical model can be combined in mixed graphical models, in which edges are parameterized as regression coefficients from generalized linear regression models57.
Mixed graphical models represent the most general approach to PMRF estimation and also allow for the inclusion of categorical and count variables.
Results
time-series data: to address time dependencies, the network model may be extended with temporal effects that represent regressions on the previous time point in a single-person case.
These temporal effects may, for instance, be estimated through the application of a vector autoregressive model.
The structure of the associations that remain after taking temporal effects into account can also be represented in a PMRF: the contemporaneous network
The temporal network can be read in terms of carry-over effects at the timescale defined by the spacing between repeated measures, where the temporal ordering can also assist causal interpretation.
The contemporaneous network will include associations that are due to effects that occur at different timescales rather than those defined by the spacing between repeated measurements.
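A minimal sketch of the two single-person networks, assuming a lag-1 vector autoregressive model fitted by ordinary least squares: the regression coefficients give the directed temporal network and the partial correlations of the residuals give the undirected contemporaneous network. The simulated series, the coefficient matrix `true_b` and the function names are assumptions for illustration, not an established estimation routine.

```python
import numpy as np

def fit_var1(series):
    """Least-squares VAR(1): series[t] is regressed on series[t - 1]."""
    y = series[1:]
    x = np.column_stack([np.ones(len(series) - 1), series[:-1]])
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    intercept = coef[0]
    temporal = coef[1:].T          # temporal[i, j]: variable j at t-1 -> variable i at t
    residuals = y - x @ coef
    return intercept, temporal, residuals

def partial_correlations(data):
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 0.0)
    return pcor

# Hypothetical single-person series with 3 variables.
rng = np.random.default_rng(1)
true_b = np.array([[0.5, 0.0, 0.0],
                   [0.3, 0.4, 0.0],
                   [0.0, 0.0, 0.3]])
noise_cov = np.array([[1.0, 0.5, 0.0],
                      [0.5, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
t_len = 5000
noise = rng.multivariate_normal(np.zeros(3), noise_cov, size=t_len)
series = np.zeros((t_len, 3))
for t in range(1, t_len):
    series[t] = true_b @ series[t - 1] + noise[t]

intercept, temporal, residuals = fit_var1(series)
contemporaneous = partial_correlations(residuals)
```

Here `temporal` is asymmetric (carry-over effects have a direction), whereas `contemporaneous` is symmetric and picks up the association between the first two variables that the lagged effects do not explain.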
Note that, just as cross-sectional networks, time-series networks almost always represent correlational data
network models can readily be estimated from cross-sectional data, assuming that all cases or rows in the data set (which usually represent people) are independent.
This assumption is violated, however, in panel data and time-series designs, in which an individual case is not a person but, rather, a single measurement moment of one of the persons in the sample.
conclusion:
whereas cross-sectional data can use independence assumptions that allow for the application of population-sample logic, time-series data require a model to deal with the dependence between data points.
In this case, violations of independence occur in two ways:
responses from the same person may correlate more strongly with one another than responses between different persons
for example, a person might feel, on average, very sad in all responses
temporal dependencies are introduced owing to the temporal aspect of data gathering
for example, a person who feels sad at 12:00 might still feel sad at 15:00
panel data
Summary
in the cross-sectional case one obtains one network (the PMRF of the association between individual differences)
in the time-series case one obtains two networks (the directed temporal network of vector autoregressive coefficients and the undirected contemporaneous network of the regression residuals)
for multiple time series and panel data one obtains three networks (temporal and contemporaneous networks driven by intra-individual processes and the between-persons network driven by individual differences).
In addition, one may use multiple time series to identify network structures that are (in)variant over individuals58 or that define subgroups59.
heart of psychometric network analysis
network structure estimation (to construct the network)
network description (to characterize the network)
network stability analysis
(to assess the robustness of results)
Key terms
joint distribution
conditional dependency/conditional association
robustness
topology = topological structures
A generic term to characterize networks in terms of their global topology, for instance in terms of density or architecture.
interdependence
Network temperature
A parameter of network models that controls the entropy of node state patterns.
A network with low temperature will allow only node states that align, such that positively connected nodes must be in the same state and negatively connected nodes must be in the opposite state, whereas a network with high temperature will allow more random patterns of activation.
low temperature = low entropy = strong structure, strong predictability = more interdependent
Ecological momentary assessment
Daily diary methodology to measure psychological states and behaviours in the moment, for instance by using ambulatory assessment devices such as mobile phones to administer questionnaires that probe how the person feels or what the person does at that specific point in time.
undirected
candidate data-generating mechanisms
entropy
the expected value of surprise
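high entropy = high expected value of surprise = interdependence (proportion of variance explained) between nodes is low, so the surprise is high: weak association, weak structure, weak predictability; low entropy = low surprise = high interdependence: strong association, strong structure, strong predictability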
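The link between network temperature and entropy can be checked on a toy Ising network by enumerating all node state patterns. This sketch assumes states coded as -1/+1, no thresholds (external fields), and a temperature that simply divides the energy of each pattern; the weights and the function name are made up for illustration.

```python
import itertools
import math

def ising_entropy(weights, temperature):
    """Shannon entropy (in bits) of the state patterns of a small Ising network."""
    n = len(weights)
    energies = []
    for state in itertools.product([-1, 1], repeat=n):
        # Aligned states on positive edges (and opposite states on negative
        # edges) lower the energy of a pattern.
        energy = -sum(weights[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(i + 1, n))
        energies.append(energy)
    unnormalized = [math.exp(-e / temperature) for e in energies]
    z = sum(unnormalized)                       # normalizing constant
    probs = [u / z for u in unnormalized]
    # Entropy = expected surprise, with surprise defined as -log2(p).
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Three attitude elements, all positively connected (hypothetical weights).
weights = [[0.0, 1.0, 1.0],
           [1.0, 0.0, 1.0],
           [1.0, 1.0, 0.0]]

low_temp_entropy = ising_entropy(weights, temperature=0.2)
high_temp_entropy = ising_entropy(weights, temperature=10.0)
```

At low temperature almost all probability sits on the two fully aligned patterns, so the entropy is close to 1 bit; at high temperature the 8 patterns are nearly equally likely and the entropy approaches the maximum of 3 bits.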
Contemporaneous network
A network that represents within-person conditional associations between variables within the same time point.
often estimated after conditioning on effects of the previous time point, as expressed in a time-series model.
edge selection
A method to determine which edges of a mixed graphical model are to be included and excluded.
regularization-based method
regularization is a process that changes the resulting answer to be "simpler". It is often used to obtain results for ill-posed problems or to prevent overfitting.[2]
non-arbitrary coding
Lambda parameter
edge selection
3 methods
approaches based on model selection through fit indices can be used
regularized estimation procedures16,33 lead to models that balance parsimony and fit, in the sense that they aim to only include edges that improve the fit of the network model to data (for instance, by minimizing the extended Bayesian information criterion35).
null hypothesis testing procedures are used to evaluate each individual edge for statistical significance30; if desired, this process can be specialized to deal with multiple testing, through Bonferroni correction or false discovery rate approaches, for example.
cross-validation approaches can be used. In these approaches, the network model is chosen based on its performance in out-of-sample prediction, such as in k-fold cross-validation31.
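Of the three methods, the null hypothesis testing route is the easiest to sketch. The example below tests each partial correlation with a Fisher z approximation and applies a Bonferroni correction; the choice of the Fisher z test, the function name and the simulated data are assumptions for illustration and are not taken from the notes.

```python
import math
import numpy as np

def select_edges_by_significance(data, alpha=0.05):
    """Keep an edge only if its partial correlation is significant after Bonferroni."""
    n, p = data.shape
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 0.0)

    n_edges = p * (p - 1) // 2
    threshold = alpha / n_edges            # Bonferroni-corrected significance level
    adjacency = np.zeros((p, p), dtype=bool)
    for i in range(p):
        for j in range(i + 1, p):
            # Fisher z: with p - 2 conditioning variables the standard error
            # is 1 / sqrt(n - (p - 2) - 3) = 1 / sqrt(n - p - 1).
            z = math.atanh(pcor[i, j]) * math.sqrt(n - p - 1)
            p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided
            adjacency[i, j] = adjacency[j, i] = p_value < threshold
    return pcor * adjacency, adjacency

# Hypothetical data: chain A -> B -> C without a direct A-C link.
rng = np.random.default_rng(2)
n = 2000
a = rng.normal(size=n)
b = 0.7 * a + rng.normal(size=n)
c = 0.7 * b + rng.normal(size=n)
data = np.column_stack([a, b, c])

edge_weights, adjacency = select_edges_by_significance(data, alpha=0.01)
```

Regularized estimation and cross-validation would replace the per-edge test with a penalty or a prediction criterion, but the output has the same shape: a weight matrix in which excluded edges are exactly zero.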
network description
network description tools from network science can be applied to investigate the topology of PMRF networks3,60.
global topology
an important distinction: sparse vs dense networks
sparse networks
few (if any) edges are present relative to the total number of possible edges.
dense networks
the converse holds, and relatively many edges are present.
important for two reasons:
optimal estimation procedures may depend on sparsity, for example
regularization-based approaches
can be expected to perform well if data are generated from a sparse network, but may not work well in dense networks.
in sparse networks the importance of individual nodes is typically more pronounced, because in dense networks all nodes tend to feature a similar large number of edges.
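The sparse vs dense distinction can be made concrete by counting edges relative to the number of possible edges. A small sketch, assuming a symmetric weight matrix with a zero diagonal; the two example networks and the function names are invented for illustration.

```python
import numpy as np

def network_density(weights, tol=1e-12):
    """Share of possible edges that are present (non-zero weight)."""
    weights = np.asarray(weights, dtype=float)
    p = weights.shape[0]
    upper = weights[np.triu_indices(p, k=1)]   # each undirected edge once
    return np.count_nonzero(np.abs(upper) > tol) / upper.size

def node_degree(weights, tol=1e-12):
    """Number of edges attached to each node."""
    return (np.abs(np.asarray(weights, dtype=float)) > tol).sum(axis=0)

# Sparse example: a chain of 4 nodes (3 of 6 possible edges).
sparse = np.array([[0.0, 0.5, 0.0, 0.0],
                   [0.5, 0.0, -0.4, 0.0],
                   [0.0, -0.4, 0.0, 0.25],
                   [0.0, 0.0, 0.25, 0.0]])
# Dense example: every pair of nodes is connected.
dense = np.full((4, 4), 0.2)
np.fill_diagonal(dense, 0.0)

sparse_density = network_density(sparse)
dense_density = network_density(dense)
sparse_degrees = node_degree(sparse)
dense_degrees = node_degree(dense)
```

In the dense example every node has the same degree, which is the situation in which individual nodes stand out less.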
Local topological properties of networks feature attributes of particular nodes or sets of nodes.
measures of centrality
node strength, which sums the absolute edge weights of edges per node;
closeness, which quantifies the distance between the node and all other nodes by averaging the shortest path lengths to all other nodes
betweenness, which quantifies how often a node lies on the shortest path connecting any two other nodes
Expected influence
a measure of centrality that takes the sign of edge weights into account
this is a new measure specifically designed for the analysis of PMRF structures.
can be appropriate when variables have a non-arbitrary coding, such as when the high values of all variables indicate more psychopathology.
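The four centrality measures can be sketched on a small weighted network. Several choices below are assumptions made for illustration: edge lengths are taken as 1 / |weight|, closeness is reported as the inverse of the average shortest path length, and betweenness simply counts node pairs whose shortest path passes through a node (which is only exact when shortest paths are unique).

```python
import numpy as np

def node_strength(weights):
    """Sum of absolute edge weights per node."""
    return np.abs(weights).sum(axis=0)

def expected_influence(weights):
    """Sum of signed edge weights per node."""
    return weights.sum(axis=0)

def shortest_path_lengths(weights):
    """All-pairs shortest paths (Floyd-Warshall) with edge length 1 / |weight|."""
    p = weights.shape[0]
    with np.errstate(divide="ignore"):
        dist = np.where(weights != 0, 1.0 / np.abs(weights), np.inf)
    np.fill_diagonal(dist, 0.0)
    for k in range(p):
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist

def closeness(weights):
    """Inverse of the average shortest path length to all other nodes."""
    dist = shortest_path_lengths(weights)
    return (weights.shape[0] - 1) / dist.sum(axis=1)

def betweenness(weights):
    """Number of node pairs whose shortest path runs through each node."""
    dist = shortest_path_lengths(weights)
    p = weights.shape[0]
    counts = np.zeros(p)
    for s in range(p):
        for t in range(s + 1, p):
            if not np.isfinite(dist[s, t]):
                continue                     # unconnected pair
            for v in range(p):
                if v not in (s, t) and np.isclose(dist[s, v] + dist[v, t], dist[s, t]):
                    counts[v] += 1
    return counts

# Hypothetical chain network with one negative edge.
weights = np.array([[0.0, 0.5, 0.0, 0.0],
                    [0.5, 0.0, -0.4, 0.0],
                    [0.0, -0.4, 0.0, 0.25],
                    [0.0, 0.0, 0.25, 0.0]])

strength = node_strength(weights)
influence = expected_influence(weights)
close = closeness(weights)
between = betweenness(weights)
```

For the second node, strength is 0.9 but expected influence is only 0.1, because the negative edge cancels most of the positive one; this is the situation in which the sign-aware measure gives a different picture.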
application in three areas
personality research
to describe the interaction between stable behavioural patterns that characterize an individual.
An approach that can help is out-of-sample predictability7
9 goals identified as relevant for conscientiousness
30 items from an adjective-based measure of conscientiousness that assess three main facets: industriousness, impulse control and orderliness.
attitude research
to model the interaction between attitude elements (feelings, thoughts and behaviours) to explain phenomena such as polarization.
mental health research
to represent disorders as systems of interacting symptoms and to represent key concepts such as vulnerability and resilience.
Reproducibility and data deposition
A challenge posed by the estimation of PMRFs from multivariate data: estimation error and sampling variation
goal: assessing the stability and accuracy of estimated parameters and comparing network models of different groups.
data resampling techniques: bootstrapping, permutation tests
robustness analyses involve 3 targets: individual edge weight estimates, differences between edges in the network, and topological metrics defined on the network structure
edge weight estimates
assessed by constructing intervals that reflect the sensitivity of edge weight estimates to sampling error:
confidence intervals, credibility intervals and bootstrapped intervals
differences between edge weights
assessed by investigating to what degree the bootstrapped intervals for the relevant coefficients overlap
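Bootstrapped intervals for edge weights, and the overlap check for two edges, can be sketched as follows. The example assumes a non-parametric bootstrap with percentile intervals around unregularized partial correlations; the simulated data and all function names are illustrative assumptions.

```python
import numpy as np

def partial_correlations(data):
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 0.0)
    return pcor

def bootstrap_edge_intervals(data, n_boot=300, level=0.95, seed=0):
    """Percentile bootstrap intervals for every edge weight."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    samples = np.empty((n_boot, p, p))
    for b in range(n_boot):
        rows = rng.integers(0, n, size=n)      # resample cases with replacement
        samples[b] = partial_correlations(data[rows])
    tail = (1 - level) / 2
    lower = np.quantile(samples, tail, axis=0)
    upper = np.quantile(samples, 1 - tail, axis=0)
    return lower, upper

def intervals_overlap(lower, upper, edge_a, edge_b):
    """True if the bootstrapped intervals of two edges overlap."""
    return bool(lower[edge_a] <= upper[edge_b] and lower[edge_b] <= upper[edge_a])

# Hypothetical data: chain A -> B -> C.
rng = np.random.default_rng(3)
n = 1000
a = rng.normal(size=n)
b = 0.7 * a + rng.normal(size=n)
c = 0.7 * b + rng.normal(size=n)
data = np.column_stack([a, b, c])

lower, upper = bootstrap_edge_intervals(data)
overlap_ab_bc = intervals_overlap(lower, upper, (0, 1), (1, 2))
```

A wide interval signals an edge weight that is sensitive to sampling error; overlapping intervals for two edges mean the data give little basis for calling one of them stronger.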
topological metrics of network properties
node centrality can be investigated through a case-dropping bootstrap,
in which progressively fewer cases are sampled from the original data set to obtain subsamples; the correlation between centrality measures in these subsamples and the total sample is plotted as a function of the size of the subsamples
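The case-dropping procedure can be sketched for node strength. The retained fractions, the number of subsamples, the use of unregularized partial correlations and the simulated data are all assumptions for illustration; the function returns the average correlation per subsample size rather than drawing the plot.

```python
import numpy as np

def strength_centrality(data):
    """Node strength from the partial correlation network of `data`."""
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 0.0)
    return np.abs(pcor).sum(axis=0)

def case_dropping_stability(data, keep_fractions=(0.9, 0.7, 0.5, 0.3),
                            n_boot=200, seed=0):
    """Mean correlation between full-sample and subsample strength centrality."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    full = strength_centrality(data)
    result = {}
    for frac in keep_fractions:
        size = int(round(frac * n))
        cors = []
        for _ in range(n_boot):
            rows = rng.choice(n, size=size, replace=False)   # drop cases
            sub = strength_centrality(data[rows])
            cors.append(np.corrcoef(full, sub)[0, 1])
        result[frac] = float(np.mean(cors))
    return result

# Hypothetical data: 5 variables in a chain with decreasing effects.
rng = np.random.default_rng(4)
n = 1000
x0 = rng.normal(size=n)
x1 = 0.8 * x0 + rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
x3 = 0.3 * x2 + rng.normal(size=n)
x4 = 0.2 * x3 + rng.normal(size=n)
data = np.column_stack([x0, x1, x2, x3, x4])

stability = case_dropping_stability(data)
```

Plotting `stability` against the retained fraction gives the curve described above: the longer the correlation stays high as cases are dropped, the more stable the centrality ordering.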
other approaches: bootstrapping, Bayesian statistics
The generalizability of network structures
assessed by comparing results in different samples.
typically assessed by examining the similarity of network structures across samples
A formal test for the invariance of networks
to assess the null hypothesis that the networks are identical at the level of the population from which individuals have been sampled
Bayesian analyses86 can also be used to assess invariance of networks.
Results