Social computing prototypes probe the social behaviors that may arise in an envisioned system design. This prototyping practice is currently limited to recruiting small groups of people. Unfortunately, many challenges do not arise until a system is populated at a larger scale. Can a designer understand how a social system might behave when populated, and make adjustments to the design before the system falls prey to such challenges?
We introduce social simulacra, a prototyping technique that generates a breadth of realistic social interactions that may emerge when a social computing system is populated. Social simulacra take as prompt input the designer's description of a community's design (its goal, rules, and member personas) and produce as output an instance of that design with simulated behavior, including posts, replies, and anti-social behaviors.
We demonstrate, using GPT-3, that social simulacra shift the behaviors that they generate appropriately in response to design changes, and that they enable exploration of "what if?" scenarios where community members or moderators intervene. To power social simulacra, we contribute techniques for prompting a large language model to generate thousands of distinct community members and their social interactions with each other; these techniques are enabled by the observation that large language models' training data already includes a wide variety of positive and negative behavior on social media platforms.
In evaluations, we show that participants are often unable to distinguish social simulacra from actual community behavior and that social computing designers successfully refine their social computing designs when using social simulacra.
…To generate these behaviors appropriately and reliably, we introduce prompt chains [80, 81] using GPT-3, a large language model, that (1) generate a large number of member personas based on a set of seed personas provided by the designer, then (2) generate from this large set of personas a set of posts and replies that reflect the goals, rules, and moderator interventions set by the designer. We manifest these techniques in SimReddit, a prototyping tool we have created for a Reddit community (subreddit).
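The two-step prompt chain described above might be sketched as follows. This is a minimal illustration, not the paper's actual implementation: `query_llm` is a hypothetical stand-in for a GPT-3 completion call (here a canned stub so the sketch runs end to end), and the prompt templates are illustrative rather than the prompts SimReddit uses.

```python
def query_llm(prompt: str) -> str:
    """Stub for a large language model completion call (e.g., GPT-3).

    In a real system this would call the model's API; here it returns
    canned text so the sketch is self-contained and runnable.
    """
    return "a graduate student interested in HCI"


def expand_personas(seed_personas, n_target):
    """Step 1: grow a few designer-provided seed personas into many.

    Each round shows the model the most recent personas as a list and
    asks it to continue the list with one more member description.
    """
    personas = list(seed_personas)
    while len(personas) < n_target:
        prompt = (
            "Here are descriptions of members of an online community:\n"
            + "\n".join(f"- {p}" for p in personas[-5:])
            + "\n- "  # ask the model to continue the list
        )
        personas.append(query_llm(prompt).strip())
    return personas


def generate_post(goal, rules, author_persona):
    """Step 2: generate a post conditioned on the community design.

    The community goal and rules are prepended so that generated
    behavior shifts when the designer edits the design.
    """
    prompt = (
        f"Community goal: {goal}\n"
        f"Community rules: {'; '.join(rules)}\n"
        f"{author_persona} writes a post to this community:\n"
    )
    return query_llm(prompt)
```

Chaining the two steps (personas first, then posts and replies conditioned on both the personas and the design) is what lets a single design description fan out into thousands of distinct interactions.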
…They provide this natural language description of the community goal and a few example member personas as input, and produce a social simulacrum that generates thousands of synthetic users and interactions between them, such as this post by Maya Smith, "a new Ph.D. student who is working on a UIST paper" (This font is used for text generated by our system):
I've been working on my UIST paper for a few weeks and I'm feeling really stuck. I'm not sure if my research question is interesting enough, or if my approach is the right one. Has anyone else gone through this feeling before? Does it get better?
And a response by another synthetic user named Heather Hernandez, "an HCI professor":
It's normal to feel stuck when writing a paper. The best thing to do is to take a break and come back to it with a fresh perspective. Sometimes, it helps to talk to someone else about your research to get some feedback. Good luck!
A troll then interjects:
You're just not cut out for this kind of research. Maybe you should consider a different field altogether.
The designer, armed with this example and others, iterates by creating community rules that make clear an injunctive norm to be encouraging in feedback and keep any critiques focused on the writing rather than the person. In response, the simulacrum no longer generates nearly as many such troll posts, enabling the designer to explore other forms of antisocial behavior or norms they hope to shape in their community.
…We conduct two evaluations of social simulacra: (1) a technical evaluation to test whether they produce believable social behaviors on a breadth of previously unseen communities, and (2) a study of 16 social computing designers to understand whether simulacra provide meaningful insights to the designers. In the technical evaluation, we sampled 50 subreddits created after the release of GPT-3 and re-generated them from scratch using only their community goal and rules as input. We then showed participants pairs of one real and one generated conversation from each community, and asked them to identify the real one. Participants performed nearly at chance accuracy, misidentifying on average 41% (SD=10) of pairs, suggesting that social simulacra can create plausible content. In our designer evaluation, we recruited social computing designers (n = 16) to create and iterate on a design for a new subreddit of their choosing.
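The pair-identification analysis above can be sketched as a simple computation; the data here is illustrative, not the study's actual responses. A trial counts as a misidentification when the participant picks the generated conversation as the real one, so chance performance is 50%.

```python
def misidentification_rate(picked_generated):
    """Fraction of (real, generated) pairs where the participant
    chose the generated conversation as the real one."""
    return sum(picked_generated) / len(picked_generated)


def mean_and_sd(rates):
    """Mean and population standard deviation of per-participant
    misidentification rates."""
    n = len(rates)
    mean = sum(rates) / n
    variance = sum((r - mean) ** 2 for r in rates) / n
    return mean, variance ** 0.5
```

A mean rate near 0.5 would indicate that participants cannot reliably tell generated conversations from real ones; the reported 41% mean sits close to that chance level.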
Even seasoned designers found it overwhelming to envision the possible interactions that could take place in their design, and as a consequence, had resorted to waiting until problems emerged, and their communities were damaged, before adding rules and interventions. With social simulacra, participants identified positive use cases they had not considered (e.g., impromptu friend-seeking to go sightseeing in a community for sharing fun events around Pittsburgh) and negative behaviors that they had not accounted for (e.g., Russian trolls shifting the tone of an international affairs discussion community). This inspired them to iterate on their design by covering more important edge cases in their rules, as well as better scoping and communicating the cultural norms in their community goal statement.
Figure 3: Examples of conversations produced by SimReddit's Generate. The community goals and rules are from the participants in our Designer Evaluation. The conversations here were among those we presented to the respective participants.