Helpful Functions
The following documentation describes helpful utility functions included with this package.
Once you're familiar with them, get involved!
- Make a discussion post introducing yourself and sharing how you're using Epistemic Network Analysis
- File an issue anytime you encounter a bug or are unable to make the package do what you need
statistics
Main.EpistemicNetworkAnalysis.statistics — Function
statistics(model::AbstractENAModel)
Produce a DataFrame containing descriptive statistics for each dimension of the model embedding
Example
model = ENAModel(data, codes, conversations, units)
stats = statistics(model)
show
Base.show — Method
show(model::AbstractENAModel)
Display text summarizing a model's configuration and summary statistics.
Example
model = ENAModel(data, codes, conversations, units)
show(model)
loadExample
Main.EpistemicNetworkAnalysis.loadExample — Function
loadExample(name::AbstractString)
Load an example dataset as a DataFrame
Datasets
- loadExample("shakespeare"): Loads the Shakespeare dataset, containing data on two plays, "Hamlet" and "Romeo and Juliet"
- loadExample("transition"): Loads the Telling Stories of Transitions dataset, containing metadata and codes only, due to the sensitive nature of the underlying text
- loadExample("efm"): Loads the Doing Evil for Money dataset
- loadExample("toy"): Loads a minimal toy example, reproduced below
Group,Convo,Unit,Line,A,B,C
Red,1,X,1,0,0,1
Red,1,Y,2,1,0,0
Blue,1,Z,3,0,1,1
Blue,1,W,4,0,0,0
Red,1,X,5,0,0,1
Red,2,X,1,1,0,0
Red,2,Y,2,1,0,0
Blue,2,Z,3,0,1,1
Blue,2,W,4,0,0,0
Red,2,X,5,1,0,0
Loading Your Own Data
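To load your own datasets, use DataFrame and CSV.File, which require the DataFrames and CSV packages
# Install the required packages (only needed once per environment)
using Pkg
Pkg.add("DataFrames")
Pkg.add("CSV")
# Load the packages
using DataFrames
using CSV
# Read the CSV file into a DataFrame
data = DataFrame(CSV.File("filename_here.csv"))
deriveAnyCode!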
Main.EpistemicNetworkAnalysis.deriveAnyCode! — Function
deriveAnyCode!(
    data::DataFrame,
    newColumnName::Symbol,
    oldColumnNames...
)
Add a new code column to data, derived from existing codes. The new code will be marked present where any of the old codes are present
Example
deriveAnyCode!(data, :Food, :Hamburgers, :Salads, :Cereal)
deriveAllCode!
Main.EpistemicNetworkAnalysis.deriveAllCode! — Function
deriveAllCode!(
    data::DataFrame,
    newColumnName::Symbol,
    oldColumnNames...
)
Add a new code column to data, derived from existing codes. The new code will be marked present only where all of the old codes are present on the same line
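The difference between the "any" and "all" rules can be illustrated with plain DataFrames operations. This is only a sketch of the semantics described here, not the package's implementation; the column names AnyAB and AllBC are hypothetical, and the data mirrors the first few rows of the toy codes.

```julia
using DataFrames

# Small stand-in table with three binary code columns
data = DataFrame(A = [0, 1, 0, 0], B = [0, 0, 1, 0], C = [1, 0, 1, 0])

# "Any" rule: present (1) on a line where at least one old code is present
data[!, :AnyAB] = Int.((data.A .> 0) .| (data.B .> 0))

# "All" rule: present (1) only on a line where every old code is present
data[!, :AllBC] = Int.((data.B .> 0) .& (data.C .> 0))
```

With these four lines, AnyAB is marked on lines 2 and 3, while AllBC is marked only on line 3, where B and C co-occur.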
Example
deriveAllCode!(data, :ObservingStudentsLearning, :Observing, :Students, :Learning)
to_xlsx
Main.EpistemicNetworkAnalysis.to_xlsx — Function
to_xlsx(filename::AbstractString, model::AbstractENAModel)
Save a model to the disk as an Excel spreadsheet, useful for sharing results with others
See also serialize for saving models in a more efficient format that can be reloaded into Julia using deserialize
Note: a from_xlsx function does not exist, but is planned. The difficulty is that Excel data is at root a human-readable string format, and some components of some models are difficult to represent reliably as human-readable strings
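For a round trip that can be reloaded into Julia, the following sketch assumes that serialize and deserialize refer to Julia's Serialization standard library. The model_like value is a hypothetical stand-in so the snippet runs on its own; in practice a fitted model would take its place.

```julia
using Serialization

# Hypothetical stand-in for a fitted model (any Julia value can be serialized)
model_like = (name = "toy", points = [0.1 0.2; 0.3 0.4])

# Write to a temporary file, then read it back
path = joinpath(mktempdir(), "model.ser")
serialize(path, model_like)
restored = deserialize(path)
```

Serialized files are generally only safe to reload with the same Julia and package versions that wrote them, so to_xlsx remains the better choice for sharing results with others.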
pointcloud
Main.EpistemicNetworkAnalysis.pointcloud — Function
pointcloud(
    model::AbstractENAModel;
    ndims::Int=nrow(model.points),
    mode::Symbol=:wide,
    z_norm::Bool=false,
    metadata::Vector{Symbol}=Symbol[]
)
Produce a point cloud matrix from a model's plotted points and optional additional metadata columns, for preparing data to pass to other packages, e.g., for machine learning.
Arguments
Required:
model: The ENA model to produce a point cloud from
Optional:
- ndims: The number of dimensions from the ENA model's embedding to include in the point cloud. The first ndims dimensions will be included. By default, all dimensions will be included
- mode: The orientation of the point cloud, either in :wide format (default) or :tall format. In wide format, the rows of the point cloud's X matrix will correspond to features. In tall format, they will correspond to units.
- z_norm: Whether to normalize the point cloud's features (default: false)
- metadata: A list of additional names of metadata columns from the model to include in the point cloud. Note, when including additional metadata, it is advised to also set z_norm to true
Fields
Once the point cloud is constructed, it will have the following fields:
- X: A matrix containing the point cloud data, in either wide or tall format
- feature_names: A vector of the names of the features included in the point cloud. When mode is :wide, feature_names corresponds to the rows of X. When mode is :tall, it corresponds to the columns of X instead.
- unit_names: A vector of the IDs of the units included in the point cloud. When mode is :wide, unit_names corresponds to the columns of X. When mode is :tall, it corresponds to the rows of X instead.
- z_normed: A boolean representing whether the point cloud was normalized
- z_means and z_stds: When z_normed is true, these are vectors of the original means and standard deviations of the features of the point cloud
Example
# Wide format DataFrame
pc = pointcloud(model)
df = DataFrame(pc.X, pc.unit_names)
# Tall format DataFrame
pc = pointcloud(model, mode=:tall)
df = DataFrame(pc.X, pc.feature_names)
# ndims, metadata, and z_norm
pc = pointcloud(model, ndims=4, mode=:tall, metadata=[:Act], z_norm=true)
df = DataFrame(pc.X, pc.feature_names)
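To make the role of z_means and z_stds concrete, here is a minimal sketch of per-feature z-normalization on a wide-format matrix, using only Julia's Statistics standard library. It assumes that z_norm standardizes each feature to zero mean and unit standard deviation, and that the sample standard deviation (Julia's default) is used; the package may differ in these details. The matrix X is a hypothetical example, not output from pointcloud.

```julia
using Statistics

# Hypothetical wide-format matrix: rows are features, columns are units
X = [ 1.0  2.0  3.0;
     10.0 20.0 30.0]

# Per-feature (row-wise) means and standard deviations
z_means = vec(mean(X, dims=2))
z_stds = vec(std(X, dims=2))

# Standardize each feature
Xz = (X .- z_means) ./ z_stds

# The stored means and standard deviations allow the original scale to be recovered
X_restored = Xz .* z_stds .+ z_means
```

After normalization, both features share the same scale even though the second was originally ten times larger, which is why normalizing is advised when metadata columns with different units are mixed in.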