Split by, split on

Tyrn
March 15, 2024, 02:07 PM
Hi,

This is a distinction I'm not at all sure of. The following is my guess, not a certainty.

Here's a string of words:

"alfa bravo charlie delta"

This is the string split:

("alfa" "bravo" "charlie" "delta")

Is it correct to say that the string is split by words, on spaces?

wrholt
March 15, 2024, 05:54 PM
I would say that I split the string on spaces (or at spaces)* in order to split it into words, but only when talking about strings like your example, which contains nothing but arbitrary groups of alphabetic characters separated by exactly one space character.

If I don't know whether a string is guaranteed to satisfy that restriction, I would say that I split the string on spaces in order to split it into substrings.

For example, if I split the previous sentence on spaces, some of the substrings are "substrings." and "restriction,", which are not words in a strict sense, as the punctuation marks "." and "," are not part of the spelling of words in English.
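To make the distinction concrete, here is a minimal Python sketch (Python is only an illustration choice; the thread does not name a language, and the variable names are made up). It splits the example string and the sentence beginning "If I don't know" on single spaces, then strips leading and trailing punctuation as one rough way to get closer to words.

```python
import string

# The example string from the first post: letters separated by single spaces.
simple = "alfa bravo charlie delta"
simple_parts = simple.split(" ")
print(simple_parts)  # ['alfa', 'bravo', 'charlie', 'delta']

# A sentence that also contains punctuation.
sentence = ("If I don't know whether a string is guaranteed to satisfy that "
            "restriction, I would say that I split the string on spaces in "
            "order to split it into substrings.")

# Splitting on spaces yields substrings, some with punctuation attached.
substrings = sentence.split(" ")
print(substrings[12], substrings[-1])  # restriction, substrings.

# Rough approximation of words: strip punctuation from both ends only,
# so the apostrophe inside "don't" is kept.
words = [s.strip(string.punctuation) for s in substrings]
print(words[12], words[-1])  # restriction substrings
```

Splitting on the space character describes the method; whether the pieces count as words depends on the input, which is why the second step is needed here.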

Disclaimer: by education I'm a computer scientist, and I've worked as a computer programmer for more than 36 years.

*For reference, Microsoft's documentation for the Split method in .NET uses "at spaces" rather than "on spaces": https://learn.microsoft.com/en-us/dotnet/api/system.string.split?view=net-8.0&redirectedfrom=MSDN#overloads

Tyrn
March 15, 2024, 11:53 PM
Yes, I meant computer science all along :o

So, we split strings on/at spaces, and into substrings. Still unclear: is it possible to split a string by words? I'm really interested in a short preposition.

wrholt
March 16, 2024, 09:28 PM
"Split a string by words" doesn't sound as natural to me as "split a string into words".

"Split a string into words" describes the end result of splitting a string. "Split a string by words" suggests the method used to split the string into smaller units, rather than identifying what the smaller units are. Here, "by words" doesn't fit well, while "into words" does.

There are other types of data for which I could use "by" to describe how to group the data into smaller sets: for example, an economic analysis could report economic data grouped by some unit of time such as year, quarter, month, or week. Or it could report the data by category: medical spending, basic necessities such as food, clothing, or housing, and so on.
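As a sketch of what "grouped by" looks like in code, assuming some made-up spending records (the tuples, the amounts, and the `group_by` helper are all hypothetical), the same data can be grouped by year or by category:

```python
from collections import defaultdict

# Hypothetical records: (year, category, amount). The values are invented.
records = [
    ("2023", "food", 120),
    ("2023", "housing", 900),
    ("2024", "food", 130),
    ("2024", "medical", 75),
]

def group_by(rows, key_index):
    """Collect rows into lists keyed by the value at key_index."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row)
    return dict(groups)

by_year = group_by(records, 0)      # data grouped by year
by_category = group_by(records, 1)  # data grouped by category

print(sorted(by_year))      # ['2023', '2024']
print(sorted(by_category))  # ['food', 'housing', 'medical']
```

In both calls "by" names the criterion used to form the groups, which matches the sense of "by" described above.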

Note that I'm a native speaker of US English from the northeastern US. I cannot speak for other national, regional, or social varieties of English: there are differences of pronoun usage between different varieties of English.

Tyrn
March 17, 2024, 12:04 AM
Possible albeit debatable. One can settle for it, then. Word by Word: The Secret Life of Dictionaries by Kory Stamper is a book I can recommend to everyone :D .

wrholt
March 19, 2024, 03:07 PM
...

Note that I'm a native speaker of US English from the northeastern US. I cannot speak for other national, regional, or social varieties of English: there are differences of pronoun usage between different varieties of English.

My proofreading wasn't that great when I stopped working on a previous reply: I meant to say "differences of preposition usage". That's what I get for writing responses after staying up longer than I probably should. :rolleyes: