#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# paragraphizer.py: reformat a single paragraph into multiple paragraphs using GPT-3 neural nets
# Author: Gwern Branwen
# Date: 2022-02-18
# When:  Time-stamp: "2023-03-26 16:12:28 gwern"
# License: CC-0
#
# Usage: $ xclip -o | OPENAI_API_KEY="sk-XXX" python paragraphizer.py
#
# Paragraphizer attempts to reformat a single run-on paragraph into multiple shorter paragraphs,
# presumably split by topic. This is particularly useful for research paper abstracts, which are
# usually written in a sequential fashion (along the lines of 'Background / Question / Data /
# Methods / Results / Conclusion') but not always formatted in topic-separated paragraphs. A
# jargon-heavy run-on abstract can be near-impossible to skim.
#
# Paragraphizer does this with a call to the OA API; I have found that a simple 'rewrite this as'
# zero-shot prompt works well with davinci-instruct models (and is unreliable with smaller models or
# plain davinci); the current version uses the chat API with `gpt-3.5-turbo` & an equivalent
# instruction. The main failure mode is that the model does not copy the abstract exactly, and may
# reword or expand on parts; that is highly undesirable, because it means Paragraphizer could not be
# used to reformat abstracts automatically. (And if you aren't going to use Paragraphizer
# automatically, why bother? It doesn't take long to add linebreaks by hand.) That failure mode can
# be guarded against by simply checking that the output, after removing the new newlines, equals the
# original input (ie. the *only* difference is the inserted newlines). The paragraph splits may
# still be poorly chosen, but at least the text itself is guaranteed unchanged.
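#
# A minimal sketch of that check (the code at the bottom of this script also strips spaces before
# comparing, since splitting at a sentence boundary typically replaces a space with the newlines):
#
#     if result.replace('\n', '') == target:
#         print(result)  # only newlines were inserted; safe to use
#     else:
#         print(target)  # the model reworded something; fall back to the original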
#
# WARNING: the input must not already contain newlines. GPT-3 also remains quite fragile in dealing
# with Unicode and HTML [see OA docs]: strip as much special formatting as possible beforehand, such
# as Unicode characters like NON-BREAKING SPACE, HTML tags like `<p>`, and HTML entities like
# `&amp;` (a pre-cleaning one-liner is sketched below). Otherwise GPT-3 will either give up and do
# nothing, or mangle the text (thereby failing the check & emitting the original input, wasting a call).
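#
# For example, a quick pre-clean before piping in (an illustrative sketch using only the Python
# standard library to unescape HTML entities, drop tags, and collapse all whitespace; adjust to taste):
#
# $ xclip -o | python3 -c 'import sys, re, html; t = html.unescape(sys.stdin.read()); t = re.sub(r"<[^>]+>", " ", t); print(re.sub(r"\s+", " ", t).strip())' | python paragraphizer.py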
#
# Example:
#
# $ xclip -o
# Most deep reinforcement learning (RL) algorithms distill experience into parametric behavior
# policies or value functions via gradient updates. While effective, this approach has several
# disadvantages: (1) it is computationally expensive, (2) it can take many updates to integrate
# experiences into the parametric model, (3) experiences that are not fully integrated do not
# appropriately influence the agent's behavior, and (4) behavior is limited by the capacity of the
# model. In this paper we explore an alternative paradigm in which we train a network to map a
# dataset of past experiences to optimal behavior. Specifically, we augment an RL agent with a
# retrieval process (parameterized as a neural network) that has direct access to a dataset of
# experiences. This dataset can come from the agent's past experiences, expert demonstrations, or
# any other relevant source. The retrieval process is trained to retrieve information from the
# dataset that may be useful in the current context, to help the agent achieve its goal faster and
# more efficiently. We integrate our method into two different RL agents: an offline DQN agent and
# an online R2D2 agent. In offline multi-task problems, we show that the retrieval-augmented DQN
# agent avoids task interference and learns faster than the baseline DQN agent. On Atari, we show
# that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and
# achieves higher scores. We run extensive ablations to measure the contributions of the components
# of our proposed method.
# $ OPENAI_API_KEY="sk-XYZ" xclip -o | python paragraphizer.py
# Most deep reinforcement learning (RL) algorithms distill experience into parametric behavior
# policies or value functions via gradient updates. While effective, this approach has several
# disadvantages: (1) it is computationally expensive, (2) it can take many updates to integrate
# experiences into the parametric model, (3) experiences that are not fully integrated do not
# appropriately influence the agent's behavior, and (4) behavior is limited by the capacity of the
# model.
#
# In this paper we explore an alternative paradigm in which we train a network to map a dataset of
# past experiences to optimal behavior. Specifically, we augment an RL agent with a retrieval
# process (parameterized as a neural network) that has direct access to a dataset of experiences.
# This dataset can come from the agent's past experiences, expert demonstrations, or any other
# relevant source. The retrieval process is trained to retrieve information from the dataset that
# may be useful in the current context, to help the agent achieve its goal faster and more
# efficiently.
#
# We integrate our method into two different RL agents: an offline DQN agent and an online R2D2
# agent. In offline multi-task problems, we show that the retrieval-augmented DQN agent avoids task
# interference and learns faster than the baseline DQN agent. On Atari, we show that
# retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves
# higher scores. We run extensive ablations to measure the contributions of the components of our
# proposed method.

import sys
import openai

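# Input: read the abstract from stdin (the usual `xclip -o | ...` pipe) when no argument is given;
# otherwise take it from the first command-line argument.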
if len(sys.argv) == 1:
    target = sys.stdin.read().strip()
else:
    target = sys.argv[1]

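# The chat prompt: a system message establishing the task, plus a user message wrapping the abstract
# in '<abstract>'/'</abstract>' tags and asking for only the reformatted text back.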
messages = [
    {"role": "system", "content": "You are a helpful assistant that adds double-newlines to split abstracts into paragraphs (one topic per paragraph.)"},
    {"role": "user", "content": f"Please process the following abstract (between the '<abstract>' and '</abstract>' tags), by adding double-newlines to split it into paragraphs (one topic per paragraph.) Please include ONLY the resulting text in your output, and NO other conversation or comments.\n\n<abstract>\n{target}\n</abstract>"}
]

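# Call the chat API (the pre-1.0 `openai` library picks up OPENAI_API_KEY from the environment
# automatically); temperature=0 keeps the output as deterministic as possible.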
result = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=messages,
    max_tokens=1024,
    temperature=0,
)['choices'][0]['message']['content']

# Verify that the model only inserted newlines: normalize whitespace on both sides and compare.
# If anything else changed, the model reworded the abstract, so emit the original input unchanged
# (and dump the rewritten version to stderr for inspection).
if target.replace('\n', '').replace(' ', '') == result.replace('\n', '').replace(' ', ''):
    print(result)
else:
    sys.stderr.write(result + '\n-----------------------------------------\n')
    print(target)
