- Summaries must be one sentence long and may not exceed 380 characters
- Be as informative as possible, while taking care that the summary is comprehensible and easy enough to follow
- For example, it is normal to include time, place and manner information. This summary contains too little detail:
"Bank robbers plundered a vault and escaped"
A better summary is:
"On March 23, 1999, five bank robbers plundered the vault of First National Bank in Poughkeepsie, NY and escaped in a bus they had stolen."
- By contrast, including too much information is unhelpful and makes the summary difficult to follow. The objective of the summary is not to paraphrase the entire text by converting it into a monstrous sentence. This is too much information and hard to read:
"Five bank robbers, all masked, one tall and middle aged and four probably in their twenties based on eye witness accounts, plundered a 1990 Wilson Safe Co. vault at the First National Bank branch in Poughkeepsie, NY which was established in 1975 and had never experienced a bank robbery in its history before the robbery which took place on March 23, 1999, and from which the robbers escaped by stealing a bus from Dutchess County Public Transit ahead of the robbery and driving it away from the bank after a robbery which took no more than 45 minutes to carry out."
- Summaries should avoid reliance on knowledge not present in the text, even if the summarizer is reasonably sure that it is true.
- Summaries are meant to be:
- Abstractive (no need to quote the text verbatim, although it is not forbidden to do so)
- Substitutive (they replace reading the text as best as possible in one sentence - they are not just meant to attract readers to reading the text; informative, not indicative)
- Focused on the most salient entities in the document, roughly expressing "who did what to whom (and when/how/why)"
- Summaries are NOT:
- The same thing as headlines (though they can be)
- Fragments - avoid "A description of ..."; prefer: "this text describes..."
- Avoiding entities not in the text: summaries should ideally not mention entities which are not actually mentioned in the text.
- Avoiding paraphrasing/synonyms if not necessary: summaries should prefer using the language in the text to describe things and events. For example, if a text says “X was killed by…”, prefer to keep that verb if possible, rather than replacing it with “X died when…”
- Shell nouns should be avoided, but not at all costs: sometimes it is impossible to avoid using a noun describing the text, even if it is not mentioned in the text. Typical examples include academic "this paper presents..." (even if the document's text span does not contain a meta-linguistic reference to the article itself)
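The length limit stated at the top of this list is the one rule here that can be checked mechanically. The following is a minimal sketch of such a check, not part of the guidelines themselves: the names `MAX_CHARS` and `check_summary` are hypothetical, and the one-sentence requirement is deliberately not tested, since abbreviations such as "Co." or "U.S." make sentence counting unreliable without a proper sentence splitter.

```python
from typing import List

# Character limit taken from the guidelines above; the constant name is hypothetical.
MAX_CHARS = 380


def check_summary(summary: str) -> List[str]:
    """Return a list of problems found in a summary (empty list if none).

    Only emptiness and the character limit are checked; whether the
    summary is a single sentence still needs a human reader.
    """
    problems = []
    text = summary.strip()  # ignore leading/trailing whitespace when counting
    if not text:
        problems.append("empty summary")
    if len(text) > MAX_CHARS:
        problems.append(f"too long: {len(text)} > {MAX_CHARS} characters")
    return problems
```

A summary passes when `check_summary(summary)` returns an empty list; all other guidelines in this section still require annotator judgment.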
¶ Academic and textbooks
- Very often introduced by a shell noun of the type "this paper/experiment/study presents/shows/argues that...", even when that noun is not explicitly mentioned in the text
- In order for the summary to be substitutive, it is recommended to include some content beyond the topic of a textbook section. For example, the first summary below is much shorter than the second, but we prefer the second, because the first is much less useful as a substitute for the entire text:
- This excerpt explains why specialization of labor increases production.
- This excerpt explains three reasons why specialization of labor increases production: it allows workers to specialize in work they have a talent for, it allows them to improve particularly in certain tasks, and it allows businesses to take advantage of economies of scale, such as by setting up assembly lines to lower production costs.
This paper presents a Web crawler-based computer system for automatically generating census data for an academic field, which is applied to the compilation of a census of faculty in Computer Science.
Using literature on 31 languages and recorded data from 10 languages, this paper argues that 'huh' is a universal word used across languages by listeners to initiate repair in conversation when they do not understand speakers.
This section of a textbook explains and exemplifies different types of government, including representative democracy as in the United States, direct democracy as in ancient Athens, monarchy as in Saudi Arabia, oligarchy as in Cuba's reigning Communist Party, and totalitarianism as in North Korea.
- Bios and other texts centered around an individual:
- typically take the form "Kim is/was a French X who ... "
- typically include information about what this person is/was known for ("... best known for...")
- information about the time period and place is typically included ("a Japanese X", "a German X living in France", "a 19th century Kenyan X")
Jared Padalecki is an award winning American actor who gained prominence in the series Gilmore Girls, best known for playing the role of Sam Winchester in the TV series Supernatural, and for his active role in campaigns to support people struggling with depression, addiction, suicide and self-harm.
Jenna Nicole Mourey, better known as Jenna Marbles, is a very successful American YouTube personality, vlogger, comedian and actress, known for her videos "How To Trick People Into Thinking You're Good Looking" and "How To Avoid Talking To People You Don't Want To Talk To".
- Avoid mentioning speaker names not spelled out in the text (i.e. if we know from metadata that speaker 2 is called Kim, but no one mentions this, then Kim should not be named in the summary)
- However, do prefer mentioning speaker names if they are mentioned in the conversation ("Kim and Yun discuss what to have for dinner")
- If no names are mentioned, it is normal to infer some relations from the conversation such as "some friends discuss" or "two children discuss with their parents". This is only done if an ordinary reader should be able to infer such a relation simply from reading the text (and not the metadata).
Dan, Judy, mom and dad open and discuss presents together on Christmas, including a book by Joseph Campbell, a yellow sweatshirt and a stainless steel bread baking pan.
Sometimes a conversation will have two or more totally different sections, in which case we simply summarize both and join them into a single sentence:
Beth, Sherry and two other friends discuss Kathy, who can't have children due to medical reasons, Bill, who works as a carpenter replacing a barn floor, and Carolyn, who reacted rudely to their friend Barb's illness, as well as their own preferences for drinking tea with or without lemon.
- In non-metalinguistic texts (i.e. fiction itself, not texts about fiction), summarize the text as if it is a literal, true story; for example, "Huckleberry Finn is fishing", not "In this extract from the novel Huckleberry Finn, fictional character Huck is..."
- Even if described events are factually incorrect, or involve science fiction or imaginary contexts, we summarize without commenting on this ("Three unicorns chat and decide to go fishing")
- Unnamed active protagonists should be referred to as "a/the protagonist"
- An unnamed narrator who is not an agent in the story can be referred to as "a/the narrator"
Jacques Chalmers, a starfighter pilot for the Empire, is terrified of overwhelming enemy forces as he leaves his deployment carrier together with his comrades, and later narrowly escapes the Enemy after witnessing the destruction of the Kethlan system.
Santa Claus's second wife, Betty Moroz, plays online video games with her friends Williams and Gomez while making dinner on Christmas Eve, and is then disappointed when Santa gets a call from his secretary Ginny and goes out to take care of the children of the world, missing dinner.
- The typical shell nouns are "this guide..." or, if it is clear that the text is only a section, "this section of a guide..."
- Since the summary is substitutive and not just indicative, it should attempt to contain the main steps or pieces of advice in the guide
This guide gives tips for ballet dancing.
This section of a ballet dancing guide explains some of ballet's history and popularity and gives practical advice on stretching before and after dancing, wearing properly fitted slippers or else socks without stickies on the bottom, wearing snug comfortable clothes such as a leotard and tights, and finding space to practice, ideally in a studio with a barre.
If the guide has a specific sub-genre, such as a recipe, it is also possible to directly label it as such in the summary, for example:
Three recipes for quinoa suggest optionally toasting it in olive oil in a pan and then cooking it in broth or water and salt on a stove, washing and cooking it in a rice cooker after optionally toasting it, or baking it in the oven in a baking dish covered with foil together with sauteed onions, peppers, mushrooms or other vegetables in a broth.
- If the approximate or exact date when the speech was held is mentioned, it is preferred to mention it in the summary, since the meaning of a speech is highly dependent on when it was given
- If the speaker making the speech is not identified within the speech itself (speaking only as 'I'), the preference for not mentioning information which is not spelled out in the text still holds (we prefer not to add this to the summary), but with several important exceptions and avoidance strategies
- If the speaker's affiliation or role is identifiable, we can mention that instead. For example, in the following UN speech from Albania, the name of the current Albanian representative is not essential, and we can replace them with 'Albania', which is clearly mentioned.
In a speech at the UN, Albania presents its work towards integration in NATO and adoption of European Union standards and collaborations ahead of general parliamentary elections the following year.
- Sometimes it is not reasonably possible to withhold the identity of the speaker. In such cases we do mention the speaker with the understanding that a reader would not be able to make sense of the speech without knowing the identity. The following example, which is from U.S. President Richard Nixon's resignation speech, does not actually mention Nixon by name, only as 'I', but the summary is forced to identify him:
Preferred: US President Richard Nixon announces his decision to resign due to the Watergate affair, resulting in Vice President Ford being sworn in as President the following day.
Not preferred: A US President announces his decision to resign due to the Watergate affair, resulting in Vice President Ford being sworn in as President the following day.
- Typically a present tense third person style is used, and events are ordered in sequence, for example: "Ash tells about her day, which includes a yoga class, marketing brand management class, doing some work while having coffee at Saxby's, and finally cooking pasta with peppers for dinner together with her boyfriend Harry."
- As in conversations, people other than the vlogger who play a significant role in the vlog should be mentioned, but if their name is not mentioned within the excerpt being annotated, then they can only be referred to using generic terms ("a friend/relative/...")
- If the vlogger does not mention that they are a vlogger in the video, or that this is a vlog, do not refer to them as such (e.g. "Jasmine tells about...", not "YouTube vlogger Jasmine tells...")
Jasmine tells about how she tested positive for Covid on December 16th after she spent time without a mask with her sister, who also tested positive, and recounts her symptoms over several days, starting from a sore throat, then fever and congestion, and finally a partial loss of smell and taste and shortness of breath.
- A shell noun does not need to be used if the guide can be summarized in a pattern along the lines of "<PLACE> is a tourist destination featuring...". For example:
The Chatham Islands east of New Zealand (population 640), which are reachable by plane from New Zealand, feature the majestic Henga Scenic Reserve and beach, Hapupu National Historic Scenic Reserve, the picturesque Port Hutt and Kopinga Marae, meeting place of the indigenous Moriori people.
- However, if the guide describes more than the place itself, this pattern will not work, since "<PLACE> is..." is not suited to introducing information beyond the place, such as transportation information:
Preferred: This guide to Isfahan, a beautiful ancient city in Iran, details its architectural sites such as the Safavid era Naqsh-e Jahan Square and Chaharbagh Boulevard, four stunning mosques and three palaces, and the Armenian Quarter (Jolfa), as well as transport information on airport and taxi connections.
Not preferred: Isfahan is a beautiful ancient city in Iran, with architectural sites such as the Safavid era Naqsh-e Jahan Square and Chaharbagh Boulevard, four stunning mosques and three palaces, and the Armenian Quarter (Jolfa), and transport information on airport and taxi connections is provided.
If we want to include the last part about transportation in the summary, then a shell noun such as "this guide to Isfahan" is the more plausible way to summarize the guide (at the cost of additional words and a shell noun 'guide' not mentioned in the text itself)
- List of common shell nouns for travel guides: (not exhaustive)