StringArray::addTokens() bug!

Post by Randy » Fri Mar 03, 2006 12:55 am

Look at these examples:

Code:
StringArray tokens;

tokens.addTokens(T("two words"), false);
// tokens.size() is 2 (correct!)

tokens.clear();
tokens.addTokens(T("five words with trailing space "), false);
// tokens.size() is 6 (incorrect!)

tokens.clear();
tokens.addTokens(T("six words with two trailing spaces  "), false);
// tokens.size() is 8 (incorrect!)

tokens.clear();
tokens.addTokens(T(" five words with leading space"), false);
// tokens.size() is 6 (incorrect!)

tokens.clear();
tokens.addTokens(T("seven words with three   spaces in middle"), false);
// tokens.size() is 9 (incorrect!)


I don't think we want leading, trailing, and multiple spaces included when tokenizing, do we?

It's easily handled with a call to StringArray::removeEmptyStrings(true) when tokenizing whitespace, but not as easily remedied when tokenizing with other break characters.
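
The counts above can be reproduced without JUCE. Below is a minimal sketch using only the standard library; `splitKeepingEmpties` and `dropEmptyTokens` are hypothetical helpers written for illustration and are not JUCE's implementation. The first splits on every break character and keeps the empty tokens, and the second filters them out afterwards, roughly the clean-up step described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of JUCE): split on every occurrence of any
// break character, keeping the empty tokens produced by leading, trailing,
// or adjacent separators.
std::vector<std::string> splitKeepingEmpties(const std::string& text,
                                             const std::string& breakChars)
{
    std::vector<std::string> tokens;
    std::size_t start = 0;

    for (;;)
    {
        const std::size_t pos = text.find_first_of(breakChars, start);

        if (pos == std::string::npos)
        {
            tokens.push_back(text.substr(start));   // final token, may be empty
            break;
        }

        tokens.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }

    return tokens;
}

// Hypothetical stand-in for the clean-up step: drop every empty token.
void dropEmptyTokens(std::vector<std::string>& tokens)
{
    tokens.erase(std::remove_if(tokens.begin(), tokens.end(),
                                [](const std::string& s) { return s.empty(); }),
                 tokens.end());
}
```

With these helpers, splitting "seven words with three   spaces in middle" on a space gives 9 tokens, and calling `dropEmptyTokens` on the result leaves the 7 words.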

Post by OvermindDL1 » Fri Mar 03, 2006 1:39 am

I'd say just use Spirit to parse it; it would be worlds easier.

Actually, Spirit is probably overkill. boost::tokenizer is made for *exactly* that kind of thing, though, and it was put through its paces years ago, so there are no known bugs. It would let you parse tokens based on whitespace or any other separator, and it would work quite well inside addTokens() in the StringArray class.

Post by jules » Fri Mar 03, 2006 9:55 am

Randy wrote:
I don't think we want leading, trailing, and multiple spaces included when tokenizing, do we?

It's easily handled with a call to StringArray::removeEmptyStrings(true) when tokenizing whitespace, but not as easily remedied when tokenizing with other break characters.


That's usually true for whitespace, but if I were tokenising with other separators, e.g.

"a,b,,c,,"

then I'd be quite annoyed if it didn't return 6 items, three of them empty. And of course removeEmptyStrings would clean this up too, if you just want the non-empty tokens.
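
To illustrate why the empty items matter with a separator like a comma, here is a minimal standard-library sketch. `splitFields` and `countEmpty` are hypothetical helpers written for this example and are not JUCE code. Keeping the empty fields means each value stays at its original position, so "c" remains the fourth item.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not JUCE code): split on a single separator character,
// keeping empty fields so that every token stays at its original position.
std::vector<std::string> splitFields(const std::string& text, char separator)
{
    std::vector<std::string> fields(1);   // start with one (possibly empty) field

    for (const char c : text)
    {
        if (c == separator)
            fields.emplace_back();        // a separator always starts a new field
        else
            fields.back() += c;
    }

    return fields;
}

// Count how many of the fields are empty.
std::size_t countEmpty(const std::vector<std::string>& fields)
{
    std::size_t n = 0;

    for (const auto& f : fields)
        if (f.empty())
            ++n;

    return n;
}
```

For the string "a,b,,c,," this sketch yields 6 fields, three of them empty, matching the count given above. Stripping the empties afterwards is then a choice the caller makes, rather than something the tokenizer forces.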

Post by Randy » Fri Mar 03, 2006 7:03 pm

A very good point, and one I'd completely overlooked since I'm only dealing with whitespace! I've just added a bunch of removeEmptyStrings() calls to my parsing routines to clean it up.