
Historical informatics
Reference:

Text and knowledge in the aspect of large language models

Orekhov Boris Valer'evich

ORCID: 0000-0002-9099-0436

PhD in Philology

Senior Researcher, Laboratory of Digital Research of Literature and Folklore, Institute of Russian Literature (Pushkin House) of the Russian Academy of Sciences

119331, Russia, Moscow, Krupskaya str., 13, sq. 77

nevmenandr@gmail.com

DOI:

10.7256/2585-7797.2023.4.44180

EDN:

BJQBQB

Received:

30-09-2023


Published:

31-12-2023


Abstract: This text focuses on the influence of large language models on the self-determination of the humanities. Large language models are able to generate plausible texts. They thus seem to join the other tools that, throughout the development of technology, have freed people from routine. At the same time, in the humanities texts are highly individualized, and knowledge itself is closely tied to its textual embodiment. If we agree that knowledge is a text, and that knowledge embodied in a different text is different knowledge, then the humanities will have to answer the question of how a text produced by a person differs in value from the same text generated by a machine. The article raises methodological and epistemological problems concerning the relationship between texts of natural and artificial origin when they are written in the genre of a scientific work. The difference between such artifacts is clearly visible only for some scientific disciplines and remains an open question for the rest. These issues should be resolved through deep reflection, which was not so urgently needed in the last centuries of the development of the humanities but is now required of the humanities scholar. Humanities scholars will have to explicitly set themselves against large language models and prove the importance of their work compared to what a neural network can generate.


Keywords:

large language models, chatgpt, scientific publications, methodology of science, text generators, knowledge, the science, text, formal languages, Humanities

This article is automatically translated. You can find original text of the article here.

 

Introduction

"Historians write texts" is how W. Kansteiner begins his article [12]. This statement does indeed contain the key problem of the relationship between artificial intelligence (AI) and the historian and, more broadly, any scholar in the humanities.

The only difficulty that W. Kansteiner sees in today's still imperfect large language models (LLMs) is that they cannot distinguish truth from falsehood, but this difficulty should eventually find a technical solution by connecting the text generator to knowledge bases. If we accept this thesis, the main nerve of the discussion lies elsewhere, for example, in the ethics of LLMs. The author therefore quickly switches to moral issues, constructing thought experiments that involve Hitler, the memory of the Holocaust and similar emotionally loaded problems.

Incidentally, when Kansteiner touches on the problem of truth, he speaks as if the concept itself were unproblematic. In fact, as F. Ankersmit shows [1], history admits an infinite number of narratives, and none of them can reasonably claim to be the "truth". This, too, is a well-known problem that has long been the subject of reflection.

At the same time, the appearance of LLMs seems to confront the humanities scholar with more complex conceptual questions of self-determination that have not been raised in this field before. They touch on the very foundations of scientific activity, and it is the successes of neural networks that provoke them.

Indeed, "historians write texts." Texts are also written by representatives of other sciences. But the stereotypical images of the scientist and of science bring to the fore something quite different: not text, but knowledge. To an outside observer, it is knowledge and truth that are the most significant results of scientific activity.

The ChatGPT system itself defines science in this way: "Science is a systematic and organized approach to the study of nature, society and various phenomena based on observation, experiments and logical analysis of data. The goal of science is to expand knowledge and understanding of the world, create new technologies and solve practical problems." Texts are not mentioned here. If we follow this definition, texts do not seem to be the main product of scientists' work. The main thing in science as an activity is gaining new knowledge and, ultimately, making scientific discoveries.

The question of alienation of knowledge from the text

In an idealized schematic world, a scientist really does produce knowledge that does not depend on its textual embodiment. G. Gachev quotes K. Gauss: "As long as we think about the subject directly only in our own mind, we do not need names or signs. They become necessary only when we want to explain it to others" [3].

In reality, the relationship between scientists and their product is more complicated. A scientist's work is evaluated by publication activity. Since it is impossible to assess the significance of a scientific idea (or even its existence) formally and strictly, all that administrators can measure is the number and status of published texts. The principle of "publish or perish" illustrates very concretely which factors determine a scientist's employment and career success.

In this case, the main question that the appearance of LLMs poses to the scientist is whether the scientist is able to do something that LLMs cannot replicate.

Scientists can generate texts, and LLMs can generate texts too. But does a scientist create the very knowledge referred to in the definitions of science? And further: does the knowledge that science is supposed to seek exist at all? If knowledge and the text in which it is embodied are inseparable, then LLMs may well replicate and, therefore, replace scientists by producing texts. Perhaps knowledge and truth are myths, and scientists merely imitate the search for them, and in this sense are no different from LLMs, whose main activity is imitation? What makes these questions unpleasant is that they have institutional consequences: is it worth allocating budgets to imitation performed by people if cheaper language models cope with it successfully?

In the picture sketched here, representatives of the humanities and the natural sciences seem to be in different positions. It is easier to imitate a humanities text than a natural science one. However, the answers to these questions are not entirely unambiguous for the natural sciences either. Thanks to the work of B. Latour [7] and to the awareness of the reproducibility crisis in the natural sciences, we know that research is difficult to separate from the laboratory in which it was performed and, consequently, from the text that the staff of this laboratory wrote for publication. Can what is said in a scientific article on chemistry or physics be alienated from the text of that article?

Any answer to this question will be debatable, but the very tendency to treat such scientific works as text-centric, that is, to treat the features of their textual embodiment as part of the knowledge they postulate, is symptomatic.

In G. Gachev's book this is stated quite clearly, and science is compared with fiction in terms of form: "science is taken <...> not only as a sum of ideas, experiences, knowledge, but also as scientific literature, and each scientific work is considered not only as an exposition of certain views on a natural phenomenon, but also as a text in natural language, where the image and style are of fundamental importance for the recreated picture of the world and the constructions of the theory" [3]. Latour discusses the rhetoric of scientific articles in a similar vein, showing which textual techniques their authors use to achieve maximum impact on the reader [7]. Less eminent but no less thoughtful authors also note how closely text is bound up with science: "Writing plays such a central role in cognition, learning, and research that it's amazing how little we think about it" [6, p. 9].

Now, thanks to LLMs, we are entitled to ask whether science is not merely an activity that is (additionally) influenced by image and style, but something that does not fundamentally differ from fiction with its images and style.

The natural sciences use formal languages; for example, an article on chemistry may contain formulas for a chemical reaction. But LLMs can also learn formal languages and generate statements in them, as has already happened with natural languages. Chemical formulas can be generated in model responses right now. Apparently, the logical way to clarify whether knowledge can be alienated from a statement in a formal language is to establish whether a narrative can be created that reformulates such a statement, and to what extent that narrative is equivalent to it.

If a paraphrasing narrative is possible and is equivalent to the statement in a formal language, then knowledge in the natural sciences is fundamentally alienable from the text. As already noted, there remain the difficulties described by B. Latour, but they will be matters of detail.

If the narrative does not correspond to the statement in a formal language, then the problem of the status of knowledge is not confined to the humanities but is common to all sciences. In other words, representatives of the natural science community depend on the text for the results of their work just as much as representatives of the humanities.

There is also a plane of this discussion in which the historical sciences turn out to be closer to the natural sciences than to the humanities: correspondence with reality. If an article on chemistry or physics mentions an experiment that no one actually conducted, this roughly corresponds to a historical study mentioning a "fact" that has no confirmation in the sources. W. Kansteiner examines such cases in detail, but what matters for us is that this circumstance is essential for the demarcation of sciences of different types: in literary analysis it is much harder to find "facts" with such a binary status ("true" or "false"). Interpretations of artistic images and of their embedding in higher-order systems (figurative systems and motif structures of cycles, constructs of an author, of an epoch, of a style) can only be more or less convincing. Persuasiveness is a category of rhetoric, not of knowledge, and rhetoric is actualized only in a specific text.

N. Luhmann describes the history of approaches to new knowledge in this way: "Before Kuhn, all earlier descriptions of the world that did not correlate with recent research were considered as more or less failed attempts to obtain scientific knowledge" [13, pp. 10-11]. Thanks to Kuhn, "outdated" theories came to be regarded as originating from other paradigms and, therefore, not as incorrect but as alternative. "One can only say: we are dealing with a different paradigm, whose claims to primacy can only be formulated in its own terms" [13, p. 11]. By and large, the peculiarity of a formulation is also a category of rhetoric.

A statement like this cannot be considered true or false: "the essence of Pasternak's poetic world is everyday life and ecstasy" [11, p. 210]; nor can this one: "I<nfinitive> W<riting> treats of a certain virtual reality that the poet holds before his mind's eye, of a certain 'there', unlike another minimalist style, the nominative one, which depicts what is described as taking place here and now" [5, p. 250].

In these examples we see that the knowledge produced by the core humanities (here we leave aside borderline cases such as linguistics or history) is inseparable from the text. These deeply individualized analytical works reveal the source of each cited scholar's arguments. For K. Taranovsky it is the classical hermeneutic tradition, and for A. K. Zholkovsky it is structural linguistics with its reliance on the functions of parts of speech.

Forgotten Soviet-era works of literary scholarship also confirm the close connection between the text and knowledge in the humanities. These works were forgotten precisely because they do not fit the current ideological agenda (in a way, an analogue of the Kuhnian "paradigm" in the humanities). Compare the abstract of the volume on Russian realism: "The scientific discussion about Russian realism, compromised by Soviet literary criticism, was interrupted in the post-Soviet era" [8].

Even positivist knowledge of a biographical or textual nature does not exist separately from its argumentation, which means that it exists within the same text. Thus, the argument for dating Tyutchev's poem "Long ago, long ago, O Blessed South..." relies on the poet's direct observations and denies him the right to fantasy: "K. V. Pigarev suggested that this poem was "written in December 1837, upon returning from Genoa to northern Italy, to Turin." It is impossible to agree with this assumption. The accuracy of landscape sketches in Tyutchev's lyrics is well known, and this cannot be ignored when dating his poems. In this case, K. V. Pigarev's assumption is contradicted by the entire text of the poem, where the “blessed” South, the “azure plain” of its sea is opposed by the harsh North <...>. This landscape in no way corresponds to the landscape of Turin and its surroundings, located in the foothills of the Alps, at the same latitude as Genoa (although, of course, the climate of these places seems more severe in comparison with the mild climate of Genoa)" [4, p. 288]. (The italics are mine. B. O.) This is falsifiable historical knowledge, yet the way it is established rests on the unconventional category of persuasiveness.

But still, unlike the natural sciences, where the basis for the text is an experiment, in the humanities there is no such empirical base, and therefore the separability of a scientific text from knowledge along the axis of factual reliability seems even more problematic. Until the current situation involving the neural network factor, this problem was not felt so acutely.

M. L. Gasparov's reflection on the interpretations of Ovid's exile from Rome is indicative in this sense: "Over more than five centuries of philological scholarship, so many hypotheses about what Ovid's 'offense' consisted of have accumulated that a recent review of them filled a rather thick book, and the list appended to it, which is far from complete, contains 111 reasoned opinions. Moreover, in different epochs different variants of the hypotheses prevailed in the most curious way. The first period is the Middle Ages and the Renaissance: Ovid's commentators do not yet have any material other than Ovid's texts and their own fantasy, and this fantasy is not rich. Pagan debauchery was to blame for everything: if Ovid committed something, it was adultery with the wife or daughter of the emperor; if Ovid saw something, it was the emperor indulging in sodomy or incest with his own daughter, or perhaps the empress coming out of the bath. The second period: by the 18th century historians have sorted out the persons and dates, the simultaneity of the exiles of Ovid and Julia is revealed, and a new version appears. A love affair was to blame for everything: if Ovid was guilty in deed, he was the lover of Julia the Younger; if guilty by sight, he was a witness to, and perhaps an accomplice in, the love of Julia and Silanus. The third period, the sober 19th century, shifts interest from the romantic aspect of events to the political one: Ovid suffered for participating in (or at least knowing about) a conspiracy allegedly organized against Augustus by Julia and Aemilius Paulus in order to enthrone Agrippa Postumus. The fourth period: at the end of the 19th century attention awakens to the dark, irrational side of the ancient world, and Ovid's unspeakable guilt turns out to be not political but religious: he either violated the rules of some undisclosed mysteries (in honor of Isis, of Eleusinian Demeter, of the Roman Good Goddess) or took part in magical divination about the fate of the emperor. Finally, the fifth period comes, and the 20th century, which lived through fascism and other forms of totalitarianism, says: no offense was named at all; Ovid was told: “you are guilty, so you are punished; and of what you are guilty, you must understand yourself”; and all Ovid's repentances are so incomprehensible precisely because he himself does not know what he is guilty of" [2]. Here the philologist draws attention to the relativity of knowledge in the humanities, its dependence on the ideological context, and its ultimate reducibility to the same category of persuasiveness: "All these theories (except, perhaps, the earliest ones) are almost plausible, but none of them is completely convincing."

It should be clarified that this text-centric status of the humanities has not always been obligatory. For Plato, as follows from the Phaedrus, the dialogue was not the final form of expression. The teaching existed orally, and the book was considered vulnerable because it could not stand up for itself in a philosophical dispute, always said the same thing (whereas the philosopher can vary his theses), and was not oriented toward an interlocutor (the written text addresses everyone at once, while the philosopher addresses a specific person and therefore acts more subtly and precisely) [10]. Thus, knowledge was alienated from the text, and the written text was not regarded as the best form of its representation.

In a sense, the following passage in the Euthydemus (289b et seq.) can be interpreted as a critique of LLMs:

"Therefore, my beautiful boy," I continued, "we need knowledge that combines the ability to make something and the ability to use what has been made."

"That is clear," he replied.

"So, as you can see, we do not need to become skilled lyre makers or adept at similar crafts. After all, here the art of making and the art of using exist separately, although they relate to the same object, because the art of making lyres and the art of playing them are very different from each other. Is that not so?"

"But in the name of the gods," I said, "if we learn the art of composing speeches, will acquiring this art make us happy?"

"I don't think so," replied Clinias, seizing on my thought.

"And how can you justify this?" I asked.

"I know some speech makers who do not know how to use the speeches they themselves have composed, just as lyre makers do not know how to use lyres. At the same time, there are other people who know how to use what the former have composed, although they cannot compose speeches themselves. It is clear that in the matter of speeches, too, the art of making is one thing and the art of using is another."

Plato speaks about the art of composing speeches (289c7), which in fact constitutes the essence of what LLMs can do. The inability to bring happiness is, for the philosopher, the main line of criticism.

J. Swift's irony should also be read as criticism: "But the world will soon appreciate the usefulness of this project; and he flattered himself that a more sublime idea had never been conceived in anyone's head. Everyone knows how difficult it is to study the sciences and arts according to the generally accepted method; meanwhile, thanks to his invention, the most ignorant person, with the help of moderate expenses and little physical effort, can write books on philosophy, poetry, politics, law, mathematics and theology with a complete lack of erudition and talent" [9].

Conclusion

The alienation of knowledge from the text is a fundamental problem that W. Kansteiner almost touched upon but did not notice. In itself it is not new and has already been recognized by humanities scholars (this is partly evident from M. L. Gasparov's text about the exile of Ovid). But thanks to LLMs its status is changing: it turns from an abstract theoretical problem into a concrete methodological one.

Neural networks above all confront the humanities with the question of what they contain besides text, and whether the signs in a scholarly text have a referent. It is possible that in the near future humanities scholars will have to demonstrate and prove the presence of a referent to an external observer.

So far, LLMs do not analyze texts well (poetic ones, for example): instead of the expected observations specific to the work, they generate strings of commonplaces applicable to any work. It is possible that with an increase in the number of parameters and with genre specialization this problem will be overcome, and human superiority in this area will not last long.

Nevertheless, the individualization of texts in the humanities can be an important mechanism that distinguishes a generated text from a natural one. Neural networks produce an "arithmetic mean" of the training data. That is, the technology itself fundamentally contradicts the idea of originality. Paradoxically, originality manifests itself when the model makes mistakes. It follows that the better an LLM is trained, the less original it is.

Originality, and the genius of the scientist associated with it, belong to the mythology of the Romantic era. Romanticism imposed on society the myth of the brilliant artist, musician, poet and scientist who overcomes the ossified views of contemporaries. Confusion in the face of the success of neural networks is confusion before the fact that the myth does not hold: an electronic device devoid of consciousness replicates what, according to the myth, only a person of genius was capable of. This explains, among other things, why the success of LLMs has had such a significant media effect.

Perhaps LLMs will force humanities scholars to return to the romantic myth of individualized science and rely on it in the fight against a digital rival. After all, if the abilities of a language model are enough to imitate a scientific text, then science is just a technology, the "arithmetic mean" of what has been read. And every humanities scholar knows that this is not the case and is constantly looking for an increment of meaning.

References
1. Ankersmit, F. R. (2003). History and Tropology: The Rise and Fall of Metaphor. Moscow: Progress-Tradition.
2. Gasparov, M. L. (1978). Ovid in exile. Publius Ovidius Naso. Sorrowful elegies. Letters from Pontus. Moscow: Nauka, 189-224.
3. Gachev, G. D. (1993). Science and national culture (humanitarian commentary on natural science). Rostov-on-Don: Rostov University pub.
4. Dinesman, T. G. (1999). On the dating and addressees of some of Tyutchev's poems. Chronicle of the life and work of F. I. Tyutchev. Moscow: LLC Lithograph, Muranovo: Museum Muranovo. Book 1, 277-290.
5. Zholkovsky, A. K. (2003). Infinitive writing: tropes and plots (Materials to the topic). Etkind's Readings: a collection of articles based on the materials of the Readings in memory of E. G. Etkind. St. Petersburg, 250-271.
6. Ahrens, S. (2022). How to make useful notes. An Effective System of Organising Ideas by the Zettelkasten Method. Moscow: Mann, Ivanov and Ferber.
7. Latour, B. (2013). Science in Action. St. Petersburg: Publishing house of the European University in St. Petersburg.
8. Vdovin, A. V. (Ed.) (2020). Russian realism of the 19th century. Society, knowledge, narrative. Moscow: New Literary Review.
9. Swift, J. (2008). Collected Works: In 3 vol. Vol. 1: A Tale of a Tub; Gulliver's Travels. Moscow: TERRA - Book Club.
10. Szlezák, T. A. (2008). How to read Plato. St. Petersburg: St. Petersburg University pub.
11. Taranovsky, K. (2000). On Poetry and Poetics. Moscow: Languages of Russian Culture.
12. Kansteiner, W. (2022). Digital Doping for Historians: Can History, Memory, and Historical Theory Be Rendered Artificially Intelligent? History and Theory, 61(4), 119-133.
13. Luhmann, N. (1994). The modernity of science. New German Critique, 61, 9-23.

Peer Review

Peer reviewers' evaluations remain confidential and are not disclosed to the public. Only external reviews, authorized for publication by the article's author(s), are made public. Typically, these final reviews are conducted after the manuscript's revision. Adhering to our double-blind review policy, the reviewer's identity is kept confidential.

The last few decades have been marked by growing interdisciplinarity in science. One need only recall bioethics, developed by V. R. Potter, who perceptively noticed the importance of an ethical understanding of new medical technologies. Modern information and communication technologies play an equally important role in the historical sciences: we mean historical informatics, the first work on which in the Soviet Union began back in the 1970s. These circumstances determine the relevance of the article submitted for review, the subject of which is large language models. The author sets out to analyze the alienation of knowledge from the text, to consider the problem of neural networks in the context of the humanities, and to identify the causes of contemporaries' confusion in the face of neural networks. The work is based on the principles of analysis and synthesis, reliability and objectivity; the methodological basis of the research is a systematic approach, which treats the object as an integral complex of interrelated elements. The scientific novelty of the article lies in the very formulation of the topic: the author seeks to characterize the problem of the relationship between artificial intelligence and the historian. The bibliography is a positive feature of the article, and its breadth should be noted: in total, the list of references includes 13 different sources and studies. An undoubted advantage of the reviewed article is its use of foreign English-language literature, which enhances the scientific novelty. Among the studies used by the author, we point to the works of W. Kansteiner and N. Luhmann, which focus on various aspects of approaches to new knowledge. Note that the bibliography is important from both a scientific and an educational point of view: after reading the article, readers can turn to other materials on its topic.
In general, in our opinion, the integrated use of various sources and studies helped the author accomplish the tasks set. The style of the article can be described as scientific, yet understandable not only to specialists but also to a wide readership, to anyone interested both in the influence of neural networks on the development of the humanities in general and in the relationship between text and new knowledge in particular. The engagement with opponents is presented at the level of the information the author collected while working on the topic. The structure of the work has a certain logic and consistency; an introduction, a main part and a conclusion can be distinguished. At the beginning, the author defines the relevance of the topic and shows that "the main question that the appearance of LLMs poses to the scientist is whether the scientist is able to do something that LLMs cannot replicate." The author draws attention to the fact that "unlike the natural sciences, where the basis for the text is an experiment, in the humanities there is no such empirical base, and therefore the separability of a scientific text from knowledge along the axis of factual reliability seems even more problematic," while this problem was not felt so acutely before the era of neural networks. The paper shows that thanks to LLMs the problem of the alienation of knowledge from the text turns from an abstract theoretical one into a concrete methodological one. The author is right that neural networks above all confront the humanities with the question of what they contain besides text, and whether the signs in a scholarly text have a referent. The main conclusion of the article is that the individualization of texts in the humanities can be an important mechanism that distinguishes a generated text from a natural one.
The article submitted for review is devoted to a topical subject and will arouse readers' interest, and its materials can be used both in training courses and in studying the problems of interaction between artificial intelligence and the humanities scholar. In general, in our opinion, the article can be recommended for publication in the journal "Historical Informatics".