
The Network of Alternative Social Media Researchers

Ethics in Alternative Social Media Research: A Forum

Earlier this month, news broke that an anonymous research team from the University of Zurich had conducted an experiment on Reddit’s “Change My View” subreddit. Briefly, the experiment involved using generative AI to persuade people to change their minds about topics. The researchers revealed their experiment after the fact. There was no informed consent, and the moderators of the “Change My View” (CMV) subreddit were not informed about the experiment until after it was completed. This led to massive condemnation of the experiment as a violation of research ethics: see here, here, and here, for some of the reporting. Hours ago, the CMV moderators posted a note saying the research team has apologized.

Rather than recapitulate the criticism of the University of Zurich AI persuasion experiment on Reddit, here on the Network of Alternative Social Media Researchers blog, we’re going to hold a forum-style conversation. The topic? Lessons we alternative social media researchers can draw from the Zurich experiment on Reddit. Several members of the ASM email list have agreed to take turns addressing this topic. Each person will respond both to the broad question of ASM research ethics and to previous parts of the conversation.

The posts:

Robert W. Gehl: Pay Up

When I heard about this experiment, my first thought was that this is merely a peek through the curtain. For all their faults, the Zurich researchers at least revealed what they did. No doubt the researchers thought that the value of their work would outweigh the discomfort of the Redditors under study – that after a bit of shock, the Redditors would agree that the experiment was noble and of course could not have been done with informed consent, so that basic tenet of research was unnecessary. The researchers probably believed their findings would ultimately be welcomed by the Change My View users – and possibly by the rest of us.

The researchers were very, very wrong, of course. But they at least revealed their work.

I’m afraid that for every Zurich team that admits they’ve experimented on social media users, there are ten thousand experiments conducted by marketers, governments, and the social media companies themselves: attempts to see what types of posts get the most interaction, what keywords work best, the best photos and videos to attach, the best ways to customize posts based on psychographics. Few of us are privy to them – they are veiled, locked behind trade secrets and non-disclosure agreements, in service to the predominant political economy of corporate social media: surveillance capitalism (or, as Aral Balkan calls it, “people farming”). The people doing these research projects learned their lesson from the fallout of the 2014 “emotional contagion” study on Facebook: it’s best not to reveal what you’re up to.

These forms of manipulation are a big reason why activists developed non-corporate alternative social media, and why people have shifted to posting in places such as the fediverse. This obviously has major implications for how we conduct research on alternative social media.

No doubt others in this forum will point to many ways to ethically engage in social research in alternative social media. For my part, I will argue that an examination of the political economy of non-corporate social media – particularly the fediverse – suggests an ethical principle that may not be obvious at first: we need to pay up.

Here’s why. Most of the instances on the fediverse are run on a not-for-profit basis. As I found in my research, the models vary, from informal, “toss me a few dollars to help out” approaches to formally organized non-profits. In all of these cases, people are paying for hosting and bandwidth out-of-pocket – they aren’t funding them by selling ads or selling user data.

In light of that, I would argue that ethical alternative social media research often should include funding or material support provided to the communities under study. For example, if someone has a large grant to study ASM, part of the grant budget should include direct payments to the affected instances. This is especially the case when the research involves bandwidth-heavy tools, like extensive use of APIs. Other forms of support could be help with moderation or co-designed research to solve specific problems faced by ASM admins and members.

Indeed, to circle back to the Zurich Reddit manipulation study, the apology from the researchers to the community included an offer of help to detect AI-generated content. Imagine if the researchers had offered that from the outset. What might have been different?

There’s much more to say, so I will turn things over to our next forum participant, Marc Damie, who is mapping relations between fediverse instances.

Marc Damie: We Need a Fediverse Research API

To understand my answer, you should know a few things about my work. As a PhD student designing privacy-preserving technologies, I want to develop protocols for the Fediverse, which requires understanding its structure. To achieve this, I created simple crawlers to “take pictures” of the Fediverse; i.e., graphs representing interactions between servers. The resulting dataset is already available online, and a paper is on its way.
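
To give a concrete sense of what such a “picture” involves, here is a minimal sketch (not the project’s actual crawler) of how a federation graph can be assembled from public data. It assumes the standard Mastodon endpoint /api/v1/instance/peers, which some instances disable, and it deliberately crawls slowly.

```python
# Illustrative sketch only: a breadth-first crawl of the public peers graph.
# This is not the project's real crawler; endpoint availability varies by instance.
import time
import requests

def fetch_peers(domain: str) -> list[str]:
    """Return the domains this instance federates with, or [] if unavailable."""
    try:
        resp = requests.get(f"https://{domain}/api/v1/instance/peers", timeout=10)
        resp.raise_for_status()
        return resp.json()
    except (requests.RequestException, ValueError):
        return []

def crawl(seed: str, max_instances: int = 50, delay: float = 2.0) -> dict[str, set[str]]:
    """Snapshot the federation graph, politely rate-limited."""
    graph: dict[str, set[str]] = {}
    queue = [seed]
    while queue and len(graph) < max_instances:
        domain = queue.pop(0)
        if domain in graph:
            continue
        peers = fetch_peers(domain)
        graph[domain] = set(peers)
        queue.extend(p for p in peers if p not in graph)
        time.sleep(delay)  # slow scraping: one request every few seconds
    return graph

if __name__ == "__main__":
    g = crawl("mastodon.social")
    print(f"Snapshot: {len(g)} instances, {sum(len(v) for v in g.values())} edges")
```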

This Reddit controversy naturally attracted my attention as it shows us what we should not do. However, this controversy is also an extreme example: the researchers actively interfered with real-world human beings, while most research on social media (like mine) consists of passive data gathering. For example, some researchers have created the “Webis Mastodon corpus”: a corpus of 700M Mastodon posts to foster research in information retrieval. Many Mastodon users were unhappy to learn their posts (potentially containing sensitive information) had been included in a dataset without their approval.

While the Reddit case is a useful starting point, the Webis corpus is more relevant to our discussion. Unlike the clearly unethical study on the “Change My View” subreddit, the Webis corpus occupies a gray area, and it involves alternative social media. This raised a question that has troubled me since beginning my research: Is my work fully ethical?

Before I started crawling, I spent a year consulting legal teams from my two research institutes. We adopted strict ethical practices: querying only public APIs, slow data scraping, respecting crawler policies, open-sourcing code, and publishing only aggregated data (no personal information). After three months of crawling, I’ve received no complaints; a Mastodon developer even “starred” my crawler’s GitHub repo!
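
Two of those practices are easy to show concretely: checking an instance’s crawler policy (robots.txt) before touching its API, and reducing whatever is collected to aggregate counts. The sketch below is illustrative only; the user-agent string, contact address, and endpoint path are placeholders rather than the project’s real configuration.

```python
# Illustrative sketch of two practices described above: honoring robots.txt
# before any API request, and keeping only aggregate statistics.
from urllib.robotparser import RobotFileParser

USER_AGENT = "asm-research-crawler/0.1 (contact: researcher@example.org)"  # placeholder

def allowed_by_robots(domain: str, path: str = "/api/v1/instance/peers") -> bool:
    """Return True only if the instance's robots.txt permits fetching this path."""
    parser = RobotFileParser()
    parser.set_url(f"https://{domain}/robots.txt")
    try:
        parser.read()
    except OSError:
        return False  # if the policy cannot be read, err on the side of not crawling
    return parser.can_fetch(USER_AGENT, f"https://{domain}{path}")

def aggregate(graph: dict[str, set[str]]) -> dict[str, int]:
    """Reduce a crawl to per-instance peer counts: no posts, no personal data."""
    return {domain: len(peers) for domain, peers in graph.items()}
```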

Despite my precautions, ethical ambiguity remains because academia lacks clear guidelines for research on ASM. Existing frameworks for centralized platforms don’t easily apply.

Robert’s point about payment is interesting. While I understand the motivation, I wouldn’t prioritize it yet. Practical implementation seems challenging: should developers be compensated? How should the payment be split between instances? Should we pay instances a fixed price regardless of their location?

However, it leads me to two follow-up considerations:

  • What do researchers bring to the Fediverse? Funding is one possible contribution, but Fediverse actors might also value the research outcomes. For example, my work could improve spam detection. This aligns with Robert’s proposal for “co-designed research”. Highlighting potential research outputs is common practice in collaborations with companies, and it could be carried over to collaborations with ASM entities.
  • How should we handle research requiring intensive API usage? Some of my crawlers need (moderately) intensive API calls to gather the necessary data. Usually, they gather raw data and then aggregate it. If there were a dedicated API endpoint that served the aggregated data directly, the computation would take a second on the server and require only a single API call. Research APIs have historically existed on Twitter and were vital for scientific research. I believe the Fediverse may need a research API, and I hope my work can partially demonstrate the value of such an API; a rough sketch of what such an endpoint might look like follows this list.
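
To give a sense of what such an endpoint could look like, here is a purely speculative sketch of a server-side aggregation route. The path and response fields are invented for illustration; nothing like this exists in Mastodon today.

```python
# Purely hypothetical sketch of a dedicated research endpoint: the instance
# computes an aggregate once, server-side, and a researcher retrieves it with
# a single API call. Route name and response fields are invented.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/v1/research/federation_summary")  # hypothetical route
def federation_summary():
    # A real implementation would compute these from the instance's own
    # federation records; the values here are placeholders.
    return jsonify({
        "instance": "example.social",
        "peer_count": 1234,
        "statuses_last_30_days": 56789,
        "generated_at": "2025-05-01T00:00:00Z",
    })

if __name__ == "__main__":
    app.run(port=8080)
```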

While creating a research API is straightforward on centralized social media, the decentralized nature of ASM introduces technical challenges (that we can reasonably overcome). For such an API to succeed, academic institutions must contribute both financially and technically. This development would also present an opportunity to establish a code of conduct for ASM research.

For example, a Fediverse research API could formally gather consent from Fediverse instances. Currently, researchers rely on public APIs, assuming by default that instances consent to data processing. A dedicated research API would allow instances to actively opt in (or out) of research studies, finally moving ASM researchers out of the ethical “gray area.”

Mareike Lisker: Pay Attention to Instance Norms

Like Marc Damie, I am also interested in the “gray areas.” And, like Marc, I am researching alternative social media as part of my PhD studies, so the question of how to ethically research and collect data in the Fediverse (specifically Mastodon) also applies to me.

Recently, I co-authored a review of the data practices of 29 studies that collected data from Mastodon. The starting point for this review was the fact that several instances explicitly prohibit data gathering in their rules and policies, which is barely reflected in current research. Only a few studies acknowledge, let alone adhere to, the instances’ rules on data gathering.

Marc very likely used the term “passive” to contrast his approach with the intrusive Zurich experiment and to signal that gathering data on social media does not involve any intervention. Still, I want to challenge the notion that gathering social media data is passive: the label conceals the fact that using an API to gather data is a very active action. The idea of passivity can contribute to the perception that researchers are uninvolved in and unaccountable to the communities they study, in the same manner that using the Mastodon API can flatten the social complexity of networks and alienate researchers from those communities.

Another distinction that I would like to bring into the discussion is between legal and ethical considerations. Legally, besides the GDPR, which applies to anybody working with data in the EU, the rules and policies of a Mastodon instance are binding only for its registered users. Unregistered users and users from other instances do not have to commit to them. Unless a researcher happens to be registered on an instance from which they gather data, and that instance prohibits data gathering in its policy documents, there is no legal violation. The federated nature of Mastodon only adds complexity to this affair, since when a toot is boosted on another instance, the originating instance’s rules no longer apply.

Ethically, however, researchers—or as a matter of fact the institutional ethics boards overseeing the research—could still feel obliged to individually review the terms and policies of each instance from which they intend to collect data.

Finally, to shift the focus from the philosophical to a more hands-on perspective: I find Rob’s proposal to include financial, material, time, or knowledge/co-design support promising, and I will consider how it could be implemented in my project. It might be a worthwhile endeavor to formulate recommendations that spell out whom to “compensate” and how (developers, admins, moderators, users), possibly depending on the location or size of the server.

I support Marc’s idea of a Research API. In our paper, we proposed the technical idea of formalizing instance rules and policies to make them machine-readable and versioned, so that they can be referred to in research. This could be incorporated into a Research API.
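
As a rough illustration of that idea, a machine-readable, versioned policy could be published at a well-known URL and fetched before any collection begins. The path, field names, and example values below are invented for this sketch and are not part of any current Mastodon release.

```python
# Speculative sketch of a machine-readable, versioned instance research policy.
# The well-known path and all field names are invented for illustration.
import requests

WELL_KNOWN_PATH = "/.well-known/research-policy.json"  # hypothetical

EXAMPLE_POLICY = {
    "version": "1.2.0",
    "updated": "2025-04-01",
    "research": {
        "allow_public_timeline_collection": False,
        "allow_aggregate_statistics": True,
        "require_contact_before_collection": True,
        "contact": "admin@example.social",
    },
}

def fetch_policy(domain: str) -> dict | None:
    """Fetch and parse the instance's (hypothetical) research policy, if it publishes one."""
    try:
        resp = requests.get(f"https://{domain}{WELL_KNOWN_PATH}", timeout=10)
        resp.raise_for_status()
        return resp.json()
    except (requests.RequestException, ValueError):
        return None  # no machine-readable policy: fall back to reading the instance's documents
```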

What most research on (alternative) social media seems to have in common is that it can happen entirely covertly by default. Once research is published, it has already been conducted and the data gathered. This inherent covertness is something we as researchers need to account for.

Jessa Lingel: Alternative to What? Thinking Through the Ethics of a Slippery Category

When I first started thinking about alternative social media, it felt pretty easy to draw the lines. In the 2010s, a couple key platforms dominated the digital landscape. Anything that wasn’t Facebook or Twitter was, essentially, alternative.

A lot has changed since then. The digital culture landscape is increasingly fragmented and hard to access. Platforms have made it harder to scrape data, which is bad for research, and ultimately bad for making better policy. Yet there’s an upside to the loss of APIs: for a long time, researchers turned to Twitter not because it was the best platform for answering their questions, but because it was the easiest platform to scrape. While the best scenario is that all platforms open themselves up to inquiry, the worst scenario is that platforms are investigated not because they’re interesting or important but because they’re accessible.

Setting aside questions of facilitating access, I want to consider a broader question about ethics of studying alternative social media. What does alternative mean today? Facebook (and its cohort of Instagram and WhatsApp) continues to dominate the landscape, but not with the totality of a decade prior. I think that’s a good thing, but it’s less clear to me what alternative means now. Alternative to what?

In an article with Rosemary Clark-Parsons, we worked through some of the ethical implications around studying alterity. Our starting premise was about the consequences of labeling a community, platform or practice as alternative:

“To approach the study of marginalized or activist media as forms of counterconduct is to position the work of resistance as fundamentally disempowered, and by extension, less legitimate or necessary than the institutions against which resistance works.”

One of the arguments we made is about letting constructs of otherness surface from participants, rather than being imposed as a label. So part of the answer to what makes a platform alternative is whether the users and communities anchoring that platform identify themselves that way. A note of caution that I’m offering here is for researchers to ask whether they’re imposing a label of alterity on what they’re studying, or letting that label surface from that community, because research ethics can change based on the answer.

When I first started writing about alternative social media, I looked to countercultural communities to think about how platforms can be made to fit the needs of those on the margins. My hope was that communities would develop platforms, tools and policies that would better reflect their politics. At times, it feels like the communities that have best exemplified these calls are those with politics I find troubling: Truth Social, Telegram, Gab. Our ethical responsibilities don’t go out the window when writing about views we dislike or even find repugnant. I think Rob makes a fantastic suggestion in terms of compensating instances or providing moderating support. But what does it mean to offer compensation to instances that are anti-feminist, transphobic, or white supremacist?

We need a framework that’s flexible enough to account for multiple forms of alterity. Such a framework needs to start from Mareike’s premise that collecting data through APIs is not passive. The history of social media research is littered with problematic studies that sailed through IRB approvals (or weren’t submitted for IRB at all) in part because the data collection was viewed as passive or publicly available. IRBs are too varied to be reliable in developing guidelines that can be followed across the academy, and they’re also too US-centric. The AOIR ethics guidelines have been a vital document for researchers looking for best practices. As a very practical suggestion, perhaps we can work to have some of these conversations reflected in the next round of guideline revisions? In the meantime, it’s incredibly encouraging for me to see folks working through these questions in the communities they’re studying, with the expertise they’ve developed in a fast-moving digital landscape.

Looking forward to seeing where these conversations take us.

J.J. Sylvia IV: Relational Data Ethics

When a Hugging Face contributor released one-million-bluesky-posts last November, the file was removed within twenty-four hours after backlash erupted — yet a clone with additional data called two-million-bluesky-posts reappeared on the same service. Less than a week ago, researchers scraped and published two billion Discord messages from 3,167 servers. These examples add to those Marc Damie and others have discussed thus far.

These episodes, taken together, should worry us, considering Casey Fiesler’s 2018 survey result, which found that 61% of Twitter users “had no idea” their tweets might be studied and 47% believed researchers needed permission in order to do so. They also replay the older Tastes, Ties & Time saga: a supposedly anonymized Harvard-Facebook dataset that Michael Zimmer re-identified in days, exposing students who never agreed to participate in research. Put simply, once social data escapes, “removal” is performative. Harvesting data at scale or intervening with bots requires an additional layer of care in our research methods.

Approaching this through a relational research lens would flip the dynamic. Rosi Braidotti’s post-human relational ethics gives us a vocabulary for creating this approach. In her framework, agency is distributed across what she calls transversal alliances of human and non-human actants. Ulrike Schultze and Richard Mason apply a similar insight to internet research, arguing that “the person whose online actions generated the content is inseparable from this digital material.” Our forum has already mapped the terrain for how we might put this theory into practice. Robert Gehl’s “Pay Up” insists researchers owe material support. Marc Damie wants a Fediverse research API. Mareike Lisker reminds us that scraping is an intervention, not “passive.” Jessa Lingel asks, alternative to what?

A relational approach weaves those threads into a different methodology. Researchers begin as genuine participants in the networks they study—moderating queues, patching code, joining governance calls—so questions arise organically from inside the community instead of from a researcher parachuting in with a harvest script. Platforms, instances, and individual users can all be given the opportunity to opt in and out of all, or even particular, research projects. Reciprocity becomes ongoing, not transactional. Bandwidth micro-payments could be set up to flow automatically with each API call, mirroring Gehl’s demand, yet expressed as infrastructure rather than charity. Damie’s API vision becomes a community-controlled gateway that issues time-limited tokens; consent isn’t just a static signature. Lisker’s critique of “passivity” evaporates because the crawler now checks a well-known policy file before proceeding and pauses if consent lapses. The participants become co-authors of the research agenda.
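
A consent-gated collection loop along those lines might look roughly like the sketch below; check_consent stands in for whatever policy lookup a project uses (for instance, the hypothetical well-known file sketched earlier in this thread), and fetch_batch stands in for the actual collection step.

```python
# Speculative sketch of the consent-aware loop described above: before every
# batch, the crawler re-checks whether the instance still opts in, and it stops
# the moment consent lapses. check_consent and fetch_batch are placeholders.
import time
from typing import Any, Callable

def collect_with_consent(
    domain: str,
    check_consent: Callable[[str], bool],
    fetch_batch: Callable[[str], Any],
    batches: int = 10,
    pause: float = 60.0,
) -> list[Any]:
    """Collect data in small batches, re-confirming opt-in before each one."""
    results: list[Any] = []
    for _ in range(batches):
        if not check_consent(domain):
            break  # consent absent or withdrawn: stop immediately
        results.append(fetch_batch(domain))
        time.sleep(pause)  # go slowly so the study doesn't hog the instance's bandwidth
    return results
```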

This framework offers more than a moral upgrade to Zurich’s rogue persuasion study on Reddit. It proposes a concrete alternative ethos: don’t hide bots to manipulate strangers. Show up with your human account. Shoulder some moderation tickets. Discuss the study plan in the open channel. Let the community decide whether large-scale data pulls are worth the bandwidth or if bot intervention is acceptable. Such a relational practice means slower timelines. Yet the Bluesky, Discord, and Mastodon ethical breaches we’ve considered show the price of speed: ethical debt that keeps accruing in the data mirrors that Fiesler warns about.

Kat Tiidenberg: We Need an ASM Ethics Oversight Board

What an exciting thread to be part of! I’ve tried to braid my thoughts into a couple of points that, I hope, integrate commenting on University of Zurich’s r/changemyview situation, reacting to the others’ thoughts in this thread, and thinking towards ASM research ethics more broadly.

For me, what happened in the case of r/changemyview is a problem that stems from the mainstreaming of procedural ethics (e.g., the ethics of review boards, approvals, etc.). The critiques of procedural ethics are well documented – standardization, universalization, bureaucratization, claiming to be able to predict the unpredictable. I am pro ethics oversight, but it does intensely frustrate me when that becomes – because of how researchers are taught, or because of how oversight boards are framed – the extent of engagement with research ethics.

Mary Midgley, my favorite moral philosopher, has a quote about our (human) “idiotic optimism about choice,” which I think is appropriate here. It signals our penchant to think that we can always choose between a good option and a bad option, where often our choices are between bad and worse. When ethics training and oversight doesn’t factor that in, when we are fed an idea of ethically unambiguous ways of doing research, we get sort of an ethics paralysis in early career researchers, and sort of a semi-belligerent ethics nihilism in some communities of scholars, for whom ethics becomes a hurdle to elegantly hack. This is also how we get gestural ethics that functions with a vocabulary of ‘adhering to standards’ and ‘complying with criteria,’ in ways that are disingenuous in that they are disconnected from what research ethics is all about – beneficence. Informed consent, confidentiality, deidentification are just tools to achieve what we actually want, which is answering our research questions without doing harm; if possible, by doing good.

A fairly popular alternative to procedural ethics (also recommended by the AoIR ethics guidelines already mentioned in this thread) is ethics of practice or situated ethics. This approach invites us to continually consider the ethical implications of our choices. Being ethical, in this view, is a matter of switched-on deliberation about what can be harmful to whom, when and why, and what can be done to minimize harm.

Now, procedural ethics paperwork usually includes questions about possible harm and plans for its mitigation. But, arguably, considerable situational awareness is needed to be able to sincerely fill out those parts of the paperwork. We can’t understand potential harm without understanding what the people in the space we’re about to reimagine as a research site think they are doing. What does it mean for them? How does it relate to who they think they are? This is how we know what they might find disrespectful, creepy, violating and what might harm them and their communities. This idea is, I think, the foundation for JJ’s practical point made earlier in this thread, of researchers needing to begin as genuine participants in the networks they study, instead of parachuting in.

Yet, harm can be an unhelpfully vague category. How much discomfort, unpleasantness, or sense of violation qualifies as harm? Here, Annette Markham has suggested the useful heuristic of creepiness. It seems unlikely that the U of Zurich researchers would have answered “no” to a question like: “do you think redditors might find it creepy when a bot pretending to be a Black person tries to convince them that BLM is bad?” One must assume they didn’t ask. And this, I think, is a product of what can be, with a hat tip to Donna Haraway, called ethics as a ‘gaze from nowhere.’

Harm and/or creepiness as the parameter of research ethics throws up another complication though, which is perhaps particularly relevant to ethics in alternative social media research. Harm to whom? Definitions of alternative social media often relate to who owns and develops the platforms (and how). We can absolutely talk about harm and beneficence to owners or developers – Rob’s very concrete idea of paying up in the first entry of this thread falls into this category; plausibly, the Zurich researchers’ offer in their apology letter does that as well. In a way, it can be argued that internet researchers tend to, at least at the level of discourse, want to avoid harm to developers and owners of ASM. However, as Mareike’s review showed, researchers seem to be ignoring the data gathering rules of Mastodon instances in practice. And as Jessa pointed out, it’s muddy here too, because what if they’re alternative, but evil? In contrast, researchers are much more ambivalent when it comes to dominant mainstream platforms. I am alluding to the researchers building strong critical arguments, legal cases, and activist interventions against the restricted access created by the so-called APIcalypse – as well as tools for getting around it – which indicates the research community’s acceptance of a sort of pirate ethics when it comes to dominant mainstream platforms.

But more conventionally, concerns of harm focus on research participants, data subjects, or users in the case of social media. This makes delineating alternative social media research ethics even harder. Does a Facebook user deserve less protection from harm because they’re not enlightened enough to be using Mastodon? Do my selfies, utterances and trace data deserve to be treated differently on one of the platforms I use and not the other? Are we going to convert everyone to ASM by being unethical to them on mainstream platforms? ☺

What then, could I contribute towards our collective ideation of ASM research ethics? First, I think what we’re doing is a great start (thank you Rob), and I wholeheartedly support Jessa’s idea that the next update of AoIR ethics guidelines could include best practices and good ideas for ethically engaging with ASM.

Second (and this might be a bit naïve, but let’s call it speculative future making), expanding on others’ thoughts on more direct collaboration with ASM entities: maybe the checkpoint does not need to sit (only) with university ethics review boards, but with ASM organizations and communities. Perhaps they need ‘ethics ombudsmen’ or some other form of research liaison; people who are members of both the ASM and research communities, or have good enough access to both, who can help with creepiness audits. Obviously, the costs (time, skill, money) should be covered by research institutions/projects. Instituting some sort of overhead system seems bureaucratically cumbersome, and hoping that mainstream platforms become taxable with a social responsibility tax, some of which could be directed into an ethics pot, seems overly optimistic. But there is a project-specific idea, inspired by Hanne Stegeman’s brilliant PhD dissertation “Behind the webcam: Contested visibility in online sex work in the Netherlands, Romania and the UK”: Hanne set up a paid sex-worker advisory board for her project.

Could we have platform-native (and project specific?) ethics advisors or boards for ASM?


Comments

For each of these posts, we will also post to Mastodon. If you have a fediverse account and reply to the ASM Network Mastodon post, that shows up as a comment on this blog unless you change your privacy settings to followers-only or DM. Content warnings will work. You can delete your comment by deleting it through Mastodon.
