by IA, GH, MM Group MIG: Metadata is Great
By the time of his presidential appointment, Henry Morgenthau, Jr. had studied agriculture at Cornell University, served as Chairman of the New York State Agricultural Advisory Commission, and served as State Conservation Commissioner. In 1933, Morgenthau became the Chairman of the Federal Farm Bureau and Farm Credit Administration under President Roosevelt. Morgenthau was confirmed as Secretary of the Treasury in January 1934 (FDR Library, 2022). Only Morgenthau’s service as Acting Secretary of the Treasury appears in the dataset we examined, that is, the first 77 pages of Volume 1 of Morgenthau’s Press Conferences (in Series 2) which document November 15, 1933 through the end of 1933.
In 1933, America was starting to hope that the worst of the Great Depression had passed thanks to FDR’s New Deal initiatives (Library of Congress, 2022). Another notable event later that year and of concern to a Treasury Secretary is the ratification of the Twenty-first Amendment, which repealed Prohibition (Westport Library, 2022). Some of Morgenthau’s later accomplishments in the Treasury Office would include contributing to the New Deal, making the Lend-Lease program to aid Britain during World War II, helping with war bonds on the homefront, starting the War Refugee Board, and offering a vision of post-war Germany with the “Morgenthau Plan” (FDR Library, 2022).
We decided to analyze Morgenthau’s interactions with the press in 1933. In the 1930s, “the press was divided – that is, owners and publishers were in general opposed to the New Deal and not inclined to rise to the challenge of the times, whereas the working press was sympathetic and did respond” (McWilliams, 1970, p. 10). Key publications during this time were the New Yorker, The Robber Barons, The Politicos, The President Makers, the Nation, and the New Republic. In addition to traditional print publications, books, radio, documentaries, photographs, and pamphlets arose as avenues for more investigative and in-depth reporting (McWilliams, 1970, p. 10). The 1930s also saw the start of the “Great Story” in journalism where stories are longer, more in-depth, have a clear narrative, provide context, and don’t use jargon (Starkman, 2014).
Business and financial journalism specifically was transforming in this era “from its narrow messaging function into a profession capable of explaining complex problems to a mass audience” (Starkman, 2014). It relied heavily on “access reporting,” that is, reporting that depends on powerful insider sources (Starkman, 2014). Such reliance on a scoop from an elite insider makes the press’s interaction with Morgenthau an interesting subject for analysis.
We examined the questions reporters asked and the answers Morgenthau gave in the press conferences. Our findings show that most of the questions were either open-ended or very specific, and that partial answers were the most common type of response.
The dataset is vast and consists entirely of context-specific dialogue. In its current form it is hard to analyze computationally because it is not in numerical or coded order, so we aim to translate the transcripts into data. Our objectives are to create numerical, visual representations of the data through data manipulation, format the data in a way that is succinct and readable, and clearly explain our narrative contribution (to journalism studies) through data-work.
As a group, we had several Zoom meetings reviewing the data and planning the project. From a manual read-through of the dataset, we decided to focus on leaks, rumors, and interactions with the press as demonstrated in the data. We wanted to see how Morgenthau’s relationship with the press changed over time and how that compares to today’s political media landscape. As none of us had used OpenRefine before, we decided to stick to a relatively small dataset: the first 77 pages of Volume 1 of Morgenthau’s press conferences collection, or all the press conferences through 1933.
The following details the software and tools that we tried during this project, both successfully and unsuccessfully, presented in roughly chronological order:
While OpenRefine would certainly have made manipulating the data much easier, and might have yielded further insights for our analysis, using it was simply not possible under the technological and financial constraints in which we worked. (MM)
The biggest obstacle was converting the scanned PDFs into data that could be uploaded to OpenRefine and used there. We used a free trial from Adobe to convert the PDF to an Excel file. However, the output it produced in OpenRefine was almost completely illegible. We determined that the problem was that the PDF was too low quality and/or scanned as image-only, so its text was not automatically recognized via optical character recognition (OCR). Some classmates had success using Photoshop, but since that was not a free resource, we looked elsewhere and found a website called www.ocr2edit.com, which worked fairly well in converting the PDF into an OCR-readable PDF file. Our attempt to convert that file to Excel through Adobe did not succeed. However, we could now manually search for words (doing so for “leak,” “no answer,” and related contentious language) and analyze their instances and context. (IA)
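The manual keyword search described above could, in principle, be approximated in code once the OCR text has been exported as plain text. The following is a minimal Python sketch under that assumption, not the procedure we actually followed. The function name `keyword_in_context`, the `width` parameter, and the sample sentence are all hypothetical, and the sample is invented rather than quoted from the transcripts.

```python
import re

def keyword_in_context(text, terms, width=40):
    """Return (term, snippet) pairs for every case-insensitive match.

    Each snippet holds up to `width` characters on either side of the
    match, so the surrounding exchange can be read and coded by hand.
    """
    hits = []
    for term in terms:
        for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
            start = max(0, match.start() - width)
            end = min(len(text), match.end() + width)
            # Collapse line breaks and repeated spaces left over from OCR.
            snippet = " ".join(text[start:end].split())
            hits.append((term, snippet))
    return hits

# Invented sample text (an assumption), NOT a quotation from the transcripts.
sample = "Q. Is there any truth to the reported leak? A. I have no answer on that."

hits = keyword_in_context(sample, ["leak", "no answer"])
for term, snippet in hits:
    print(f"{term}: ...{snippet}...")
```

A search like this only locates candidate passages. Deciding whether a question is actually contentious still requires reading each hit in context, as we did manually.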
We completed three manual manipulations and tallies of the data set: what kinds of questions were posed to Morgenthau, what types of questions referenced leaked or sensitive information, and how completely Morgenthau answered questions.
For one manipulation, we analyzed the 230 questions reporters asked over the 11 press conferences Morgenthau held from November 15, 1933 through December 29, 1933. The questions were sorted into two main categories: initial questions (which opened a new topic), or follow-up questions (which directly followed up on another question or revisited a topic brought up earlier in the same press conference). We used the Revolution Learning and Development company’s (2022) eight different types of questions to make subcategories and create the following definitions:
Reflective, leading, hypothetical, and rhetorical questions could be either initial or follow-up questions; closed and open questions, however, were only initial questions. Specific and probing questions were almost always follow-up questions per their definitions (Revolution, 2022), with three rare exceptions in which Morgenthau, rather than a reporter, raised a new topic and a reporter then asked an initial question of one of those two types. (MM)
Figure 1 shows what kinds of questions were posed to Morgenthau, Figure 2 looks at the questions that involved leaked or sensitive information, and Figure 3 shows the types of answers Morgenthau provided. In summary, 230 questions were asked over 11 press conferences. Nearly a quarter of the questions (24%) were of a contentious nature, referencing leaks, rumors, and other negative reports, while 30% of questions remained unanswered by Morgenthau.
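The coding scheme above can be pictured as a simple tally over hand-coded rows. The sketch below is illustrative only and assumes each question has already been coded manually as a `(position, question_type)` pair. The names `tally_questions`, `QUESTION_TYPES`, and `INITIAL_ONLY` are hypothetical, and the sample rows are invented rather than taken from our dataset.

```python
from collections import Counter

# The eight question types named in the text (Revolution, 2022).
QUESTION_TYPES = {
    "reflective", "leading", "hypothetical", "rhetorical",
    "closed", "open", "specific", "probing",
}
# Per the text, closed and open questions were only ever initial questions.
INITIAL_ONLY = {"closed", "open"}
POSITIONS = {"initial", "follow-up"}

def tally_questions(coded_rows):
    """Count hand-coded questions by (position, question_type)."""
    counts = Counter()
    for position, question_type in coded_rows:
        if position not in POSITIONS:
            raise ValueError(f"unknown position: {position!r}")
        if question_type not in QUESTION_TYPES:
            raise ValueError(f"unknown question type: {question_type!r}")
        if question_type in INITIAL_ONLY and position != "initial":
            raise ValueError(f"{question_type!r} should be an initial question")
        counts[(position, question_type)] += 1
    return counts

# Invented rows for illustration (an assumption), not our actual coding.
sample_rows = [
    ("initial", "open"),
    ("follow-up", "probing"),
    ("follow-up", "probing"),
    ("follow-up", "specific"),
]
counts = tally_questions(sample_rows)
print(counts[("follow-up", "probing")])
```

Specific and probing questions are deliberately allowed in either position here, since the text notes three rare cases where they served as initial questions.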
Across the 11 press conferences, there was an average of 20.91 questions asked per conference, with 7.72 of them initial questions and 13.18 of them follow-up questions as seen in Figure 1. There was an average of 2.31 follow-up questions asked after an initial question.
Figure 1. A stacked bar graph showing the average types of questions asked per press conference.
Our analysis of the data also noted how many questions explicitly or implicitly (to the best of our historical knowledge) referenced leaks, rumors, or other contentious press reports: 55 questions, or 24% of the dataset. Of the roughly 21 questions asked per press conference, five on average dealt with leaks, rumors, or contentious reports. Of these five, about two were initial questions and three were follow-up questions, as depicted in Figure 2.
Figure 2. A stacked bar graph showing the types of questions asked related to leaks, rumors, or contested reports in the media.
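The headline figures in this section follow from three totals reported in the text: 230 questions, 11 press conferences, and 55 contentious questions. The short Python sketch below simply reproduces that arithmetic. The variable names are hypothetical, and nothing beyond those three totals is assumed.

```python
# Totals as reported in the text.
TOTAL_QUESTIONS = 230
PRESS_CONFERENCES = 11
CONTENTIOUS_QUESTIONS = 55

# Share of all questions that referenced leaks, rumors, or contested reports.
contentious_share = CONTENTIOUS_QUESTIONS / TOTAL_QUESTIONS

# Per-conference averages.
questions_per_conference = TOTAL_QUESTIONS / PRESS_CONFERENCES
contentious_per_conference = CONTENTIOUS_QUESTIONS / PRESS_CONFERENCES

print(f"{contentious_share:.1%}")           # 23.9%, reported as 24%
print(f"{questions_per_conference:.2f}")    # 20.91
print(f"{contentious_per_conference:.2f}")  # 5.00
```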
Figure 3 shows that Morgenthau completely answered 32% of the questions, partially answered 48%, and disregarded or did not answer 34% of the questions on pages 1 through 77 of the bound volume.
Figure 3. A bar graph showing how fully Morgenthau answered questions during press conferences.
We note the similar percentages of questions Morgenthau refused to answer (34%) and those posed by the press with the aforementioned negative connotations (24%). Without over-romanticizing, their verbal chess matches suggest that they did hold a mutual respect, each party cooperating about a quarter of the time and as much as the other.
While it is difficult to ascertain emotions through text alone, certain phraseology used during the press conferences suggests at least some cordiality between Morgenthau and the press. During one exchange, Morgenthau begins with, “All right, gentlemen, what is on your mind?” The press, in turn, asks him the same question, hinting that their exchanges could be playful and familiar at times. (MM)
An earlier period, from the 1890s to the 1920s, saw a rise in journalism tactics that aimed critique at notable figures and industries in society. This form of journalism, known as muckraking, gained traction due to an increase in high-school educated members of society, and it was also popularized primarily during times of unrest caused by WWI.
The press conferences that our story analyzes took place at a time when muckraking is said to have been losing traction within mainstream journalism. However, muckraking forever altered the climate of journalism (McWilliams, 1970; Getchell, 2022). It changed the interviewer/interviewee relationship and caused more apprehension overall. Journalists are eager to gain valuable, exclusive information that could be leaked, held as a bargaining chip, or used to further an inflammatory narrative. Cautious public figures are careful to avoid these pitfall questions so as not to reveal sensitive information, cause negative spin, or unveil unflattering information that the institutions they represent try to hide.
This is where we find the center of our ethical considerations in regard to our dataset. Our group strives to inspect how the idea of journalism exploiting controversial, narrative-pushing information has shaped the ways journalists and their interviewees interact. We asked the data which questions Morgenthau answered in full, which received answers that diverted or disclosed only partial information, and which were disregarded or met only with a statement of refusal to answer. Such data-work does not tell us what went through Morgenthau’s mind when each interview question was posed, but we can infer his approach by considering how questions were regarded and answered. Nor can every unanswered or negatively regarded question be attributed to the risk of leaking sensitive information. We can gain some understanding of what Morgenthau was privy to, could and could not speak on, and did or did not know or have access to (categories of information access).
Regarding user needs, without manipulation the dataset is unsearchable, which makes it nearly unusable with most modern search functionalities (Riva et al., 2017). OCR improvements would make its content searchable rather than accessible only by keyword or number. Converting this data into a searchable document allows for keyword searchability, metadata harvesting, and future record creation. Constructing numerical data for this set also allows for other succinct, logical explanations of the content within. Overall, the manipulations described increase accessibility for the agency’s users. (GH)
With more time and monetary resources, we would use Photoshop to clean up the PDF images so that they are more readable in an OCR conversion. More formalized training in OpenRefine, should the datasets actually upload in a usable way, would be essential for further work. With access to the original files and with different quality settings, it might be worth re-scanning some. Collaborating with a journalism department, journalism historians, or journalism librarians would also be a worthwhile next step to ensure correct interpretations of the data. Cross-referencing this and subsequent volumes of the data with newspaper archives from those times, to see what was published of the conferences, would be exciting. It would be interesting to see how Morgenthau’s interactions with the press evolved throughout his tenure as Treasury Secretary and across the changing political and social landscape of the 1930s and 1940s, and to compare them with relationships between recent Cabinet secretaries and the press in modern-day political/financial journalism. (IA)