Prediction accuracy by cluster — pacc.by.cluster • tripletTools

This function computes how well a participant's held-out judgments are predicted by their own embedding and embeddings from members of the same or other clusters.

Usage

pacc.by.cluster(pacc, clusts, samediff = TRUE)

Arguments

pacc: Participant-by-participant matrix of predictive accuracies, of the kind returned by get.prediction.matrix.
clusts: Vector indicating the cluster membership of each participant in pacc.
samediff: If TRUE, returns, for each participant, the mean prediction accuracy from participants in the same cluster and the mean from participants in different clusters. Otherwise, returns the mean prediction accuracy from participants in each cluster.

Value

A participant-by-cluster matrix giving the mean accuracy with which each participant's held-out judgments are predicted from the embeddings in each cluster.

Details

Participants can be clustered based on their pairwise representational distances, as returned by get.rep.dist. This function then computes how well, on average, the embeddings within a cluster predict each participant's held-out (test) judgments. It also returns how well the participant's own embedding predicts their held-out judgments. If the clusters are useful, a participant's judgments should be better predicted by their cluster-mates than by non-cluster-mates. If cluster-mates all share the same embedding, same-cluster prediction should be as good as own-embedding prediction.

The first column of the returned matrix is always the prediction accuracy from the participant's own embedding. If samediff is TRUE, the returned matrix also includes the mean prediction accuracy from the participant's own cluster and from all participants not in that cluster. If samediff is FALSE, the returned matrix instead includes one column per cluster, with entries giving the mean prediction accuracy from the embeddings in that cluster.
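The self / same / other layout described above can be illustrated with a small base-R sketch. This is not the tripletTools implementation: summarise_by_cluster is a hypothetical helper, the toy matrix values are made up, and two assumptions are made that the documentation does not confirm, namely that pacc[i, j] holds the accuracy of predicting participant i's held-out judgments from participant j's embedding, and that the same-cluster mean excludes the participant's own embedding.

```r
# Hypothetical helper, NOT part of tripletTools: summarise a prediction
# matrix into self / same-cluster / other-cluster means.
# ASSUMPTION: pacc[i, j] is the accuracy of predicting participant i's
# held-out judgments from participant j's embedding.
# ASSUMPTION: the "same" mean excludes the participant's own embedding.
summarise_by_cluster <- function(pacc, clusts) {
  n <- nrow(pacc)
  out <- matrix(NA_real_, nrow = n, ncol = 3,
                dimnames = list(rownames(pacc), c("self", "same", "other")))
  for (i in seq_len(n)) {
    same <- clusts == clusts[i]
    same[i] <- FALSE                 # drop the participant themselves
    other <- clusts != clusts[i]
    out[i, "self"] <- pacc[i, i]
    # Leave NA when a participant has no cluster-mates or no non-mates
    out[i, "same"] <- if (any(same)) mean(pacc[i, same]) else NA_real_
    out[i, "other"] <- if (any(other)) mean(pacc[i, other]) else NA_real_
  }
  out
}

# Toy data: four participants in two clusters (values are made up)
toy_pacc <- matrix(c(0.80, 0.78, 0.60, 0.58,
                     0.77, 0.81, 0.59, 0.61,
                     0.57, 0.60, 0.79, 0.76,
                     0.62, 0.58, 0.75, 0.82),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(paste0("p", 1:4), paste0("p", 1:4)))
toy_clusts <- c(1, 1, 2, 2)

toy_pbc <- summarise_by_cluster(toy_pacc, toy_clusts)
colMeans(toy_pbc)
```

In this toy case the same-cluster mean sits close to the self column and well above the other-cluster mean, which is the pattern expected when the clusters are useful.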

Examples

repdist <- get.rep.dist(icon_emb_ind) # Representational distances

# Hierarchical clustering
hc <- hclust(as.dist(repdist), method = "ward.D")
clusts <- cutree(hc, 2) # Cut tree to yield two clusters

pacc <- get.prediction.matrix(icon_emb_ind, icon_triplets) # Prediction matrix
pbc <- pacc.by.cluster(pacc, clusts, samediff = TRUE)

colMeans(pbc)
#>      self      same     other 
#> 0.7821451 0.7758738 0.5957189