Need help comparing the words in two String arrays.
Hello,
I have a text paragraph and a String[] of StopWords. I need to compare
each word of the paragraph with the StopWords array; if a word in the
paragraph does not match any stop word, the match flag stays false and
that word is pushed into a Vector. To compare each word of the text
paragraph, I have put the words in a String[] like this:
String[] arrAbstractText = txtAbstract.split("\\ ");
txtAbstract is the below text paragraph.
****************************** Here is the txtAbstract ******************************
Abstract : A comparative transcriptome analysis for successive stages
of Arabidopsis developmental leaf senescence (NS),
darkening-induced senescence of individual leaves attached to the plant
(DIS) and senescence in dark-incubated detached
leaves (DET) revealed many novel senescence-associated genes with
distinct expression profiles. The three senescence
processes share a high number of regulated genes, although the overall
number of regulated genes during DIS and DET is about
two times lower than during NS. Consequently, the number of NS-specific
genes is much higher than of DIS- or DET-specific
genes. The expression profiles of transporters, receptor like kinases,
autophagy genes and hormone pathways were analysed in
detail. The Arabidopsis transporters and other integral membrane
proteins were systematically re-classified based on the
Transporter Classification System. Coordinate activation or
inactivation of several genes is observed in some transporter
families in all three or only in individual senescence types,
indicating differences in the genetic programs for
remobilization of catabolites. Characteristic senescence type-specific
differences were also apparent in the expression
profiles of (putative) signaling kinases. For eight hormones the
expression of biosynthesis, metabolism, signaling and
(partially) response genes was investigated. In most pathways novel
senescence-associated genes were identified. The
expression profiles of hormone homeostasis and signaling genes reveal
additional players in the senescence regulatory
network.
*****************************************************************************************************
After calling split("\\ "), the above paragraph becomes an array, and
when I debug, the value of "arrAbstractText" looks like this:
****************************** arrAbstractText array ******************************
[A, comparative, transcriptome, analysis, for, successive, stages, of,
Arabidopsis, developmental, leaf, senescence, (NS),, darkening-induced,
senescence, of, individual, leaves, attached, to, the, plant, (DIS),
and, senescence, in, dark-incubated, detached, leaves, (DET), revealed,
many, novel, senescence-associated, genes, with, distinct, expression,
profiles., The, three, senescence, processes, share, a, high, number,
of, regulated, genes,, although, the, overall, number, of, regulated,
genes, during, DIS, and, DET, is, about, two, times, lower, than,
during, NS., Consequently,, the, number, of, NS-specific, genes, is,
much, higher, than, of, DIS-, or, DET-specific, genes., The,
expression, profiles, of, transporters,, receptor, like, kinases,,
autophagy, genes, and, hormone, pathways, were, analysed, in, detail.,
The, Arabidopsis, transporters, and, other, integral, membrane,
proteins, were, systematically, re-classified, based, on, the,
Transporter, Classification, System., Coordinate, activation, or,
inactivation, of, several, genes, is, observed, in, some, transporter,
families, in, all, three, or, only, in, individual, senescence, types,,
indicating, differences, in, the, genetic, programs, for,
remobilization, of, catabolites., Characteristic, senescence,
type-specific, differences, were, also, apparent, in, the, expression,
profiles, of, (putative), signaling, kinases., For, eight, hormones,
the, expression, of, biosynthesis,, metabolism,, signaling, and,
(partially), response, genes, was, investigated., In, most, pathways,
novel, senescence-associated, genes, were, identified., The,
expression, profiles, of, hormone, homeostasis, and, signaling, genes,
reveal, additional, players, in, the, senescence, regulatory, network.]
********************************************************************************************************
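Note that the tokens in the dump above still carry punctuation ("(NS),", "genes,", "profiles.", "DIS-"), so they cannot equal a bare word such as "genes". The following is only a minimal sketch of one way to normalize tokens before comparing, assuming a Java version with generics; the class name AbstractTokenizer and the method tokenize are hypothetical names, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the original post.
class AbstractTokenizer {
    // Splits on runs of whitespace, strips punctuation from both ends
    // of each token, and lower-cases the result.
    static List<String> tokenize(String text) {
        List<String> words = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            // Remove leading/trailing characters that are neither letters nor digits,
            // so "(NS)," becomes "NS" while "dark-incubated" keeps its inner hyphen.
            String word = raw
                    .replaceAll("^[^\\p{L}\\p{N}]+|[^\\p{L}\\p{N}]+$", "")
                    .toLowerCase(Locale.ROOT);
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("leaf senescence (NS), darkening-induced genes, DIS- or DET-specific"));
    }
}
```

Splitting on "\\s+" instead of a single space also avoids empty tokens when the text contains line breaks or repeated spaces.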
The StopWords array looks like this when I debug the code:
****************************** StopWords array ******************************
[a, a's, able, about, above, according, accordingly, across, actually,
after, afterwards, again, against, ain't, all, allow, allows, almost,
alone, along, already, also, although, always, am, among, amongst, an,
and, another, any, anybody, anyhow, anyone, anything, anyway, anyways,
anywhere, apart, appear, appreciate, appropriate, Approximately, are,
aren't, around, as, aside, ask, asking, associated, at, available,
away, awfully, b, be, became, because, become, becomes, becoming, been,
before, beforehand, behind, being, believe, below, beside, besides,
best, better, between, beyond, both, brief, but, by, c, c'mon, c's,
came, can, can't, cannot, cant, cause, causes, certain, certainly,
changes, clearly, co, com, come, comes, concerning, conditions,,
consequently, consider, considering, contain, containing, contains,
corresponding, could, couldn't, course, currently, d, definitely,
described, despite, did, didn't, different, do, does, doesn't, doing,
don't, done, down, downwards, during, e, each, edu, eg, eight, either,
else, elsewhere, enough, entirely, especially, et, etc, even, ever,
every, everybody, everyone, everything, everywhere, ex, exactly,
example, except, f, far, few, fifth, first, five, followed, followin,
follows, for, former, formerly, forth, four, from, further,
furthermore, g, get, gets, getting, given, gives, go, goes, going,
gone, got, gotten, greetings, h, had, hadn't, happens, hardly, has,
hasn't, have, haven't, having, he, he's, hello, help, hence, her, here,
here's, hereafter, hereby, herein, hereupon, hers, herself, hi, him,
himself, his, hither, hopefully, how, howbeit, however, i, i'd, i'll,
i'm, i've, ie, if, ignored, immediate, in, inasmuch, inc, indeed,
indicate, indicated, indicates, inner, insofar, instead, into, inward,
is, isn't, it, it'd, it'll, it's, its, itself, j, just, k, keep, keeps,
kept, know, knows, known, l, last, lately, later, latter, latterly,
least, less, lest, let, let's, like, liked, likely, little, look,
looking, looks, ltd, m, mainly, many, may, maybe, me, mean, meanwhile,
merely, might, more, moreover, most, mostly, much, must, my, myself, n,
name, namely, nd, near, nearly, necessary, need, needs, neither, never,
nevertheless, new, next, nine, no, nobody, non, none, noone, nor,
normally, not, nothing, novel, now, nowhere, o, obviously, of, off,
often, oh, ok, okay, old, on, once, one, ones, only, onto, or, other,
others, otherwise, ought, our, ours, ourselves, out, outside, over,
overall, own, p, particular, particularly, per, perhaps, placed,
please, plus, possess, possible, presumably, probably, provides, q,
que, quite, qv, r, rather, rd, re, really, reasonably, regarding,
regardless, regards, relatively, respectively, right, s, said, same,
saw, say, saying, says, second, secondly, see, seeing, seem, seemed,
seeming, seems, seen, self, selves, sensible, sent, serious, seriously,
seven, several, shall, she, should, shouldn't, since, six, so, some,
somebody, somehow, someone, something, sometime, sometimes, somewhat,
somewhere, soon, sorry, specified, specify, specifying, still, sub,
such, sup, sure, t, t's, take, taken, tell, tends, th, than, thank,
thanks, thanx, that, that's, thats, the, The, their, theirs, them,
themselves, then, thence, there, there's, thereafter, thereby,
therefore, therein, theres, thereupon, these, they, they'd, they'll,
they're, they've, think, third, this, thorough, thoroughly, those,
though, three, through, throughout, thru, thus, to, together, too,
took, toward, towards, tried, tries, truly, try, trying, twice, two, u,
un, under, unfortunately, unless, unlikely, until, unto, up, upon, us,
use, used, useful, uses, using, usually, uucp, v, value, various, very,
via, viz, vs, w, want, wants, was, wasn't, way, we, we'd, we'll, we're,
we've, welcome, well, went, were, weren't, what, what's, whatever,
when, whence, whenever, where, where's, whereafter, whereas, whereby,
wherein, whereupon, wherever, whether, which, while, whither, who,
who's, whoever, whole, whom, whose, why, will, willing, wish, with,
within, without, won't, wonder, would, would, wouldn't, x, y, yes, yet,
you, you'd, you'll, you're, you've, your, yours, yourself, yourselves,
z, zero, -, %, !, @, #, $, ^, &, *, (, ), +, =, ,, ., /, ?, <, >, ~, `]
*********************************************************************************************************
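A few entries in the list above are capitalized ("Approximately", "The"), and one appears to carry a stray comma ("conditions,"), so a comparison against a lower-cased word can never hit them. As a rough sketch, assuming the array is available as a String[], the entries could be normalized once into a Set for fast lookup; StopWordSet and build are hypothetical names introduced only for illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the original post.
class StopWordSet {
    // Builds a lookup set from the raw stop-word array:
    // trims, lower-cases, and drops a stray trailing comma.
    static Set<String> build(String[] stopWords) {
        Set<String> set = new HashSet<>();
        for (String raw : stopWords) {
            String word = raw.trim().toLowerCase(Locale.ROOT);
            // An entry such as "conditions," loses its comma; a lone "," is kept as-is.
            if (word.length() > 1 && word.endsWith(",")) {
                word = word.substring(0, word.length() - 1);
            }
            if (!word.isEmpty()) {
                set.add(word);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        Set<String> stops = build(new String[] {"The", "the", " Approximately ", "conditions,", ","});
        System.out.println(stops.contains("approximately")); // true: the entry was lower-cased
    }
}
```

With a Set, checking one word is a single contains call rather than a pass over the whole array.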
The code I wrote to compare and filter the words is:
*********************************************************************************************************
String[] arrAbstractText = txtAbstract.split("\\ ");
boolean match = false;
for (int k = 0; k < arrAbstractText.length; k++) {
    for (int l = 0; l < stopWords.length; l++) {
        s = String.valueOf(arrAbstractText[k]).trim();
        if (s.length() > 0 && stopWords[l].trim().equals(s.toLowerCase())) {
            match = true;
        }
    }
    if (!match) {
        vFWords.add(arrAbstractText[k].toString().toLowerCase());
        System.out.println("Words do not match :" + arrAbstractText[k].toLowerCase().trim());
    }
}
*********************************************************************************************************
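One likely reason words go missing in the snippet above: match is set to false only once, before the outer loop. The first token "A" lower-cases to "a", which is the first stop word, so match becomes true and is never reset; every later word is then treated as a match and skipped. The sketch below shows one possible shape of the loop in which the check is evaluated fresh for each word. It assumes a Java version with generics, and StopWordFilter and filter are hypothetical names, not the original code.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.Vector;

// Hypothetical helper, not part of the original post.
class StopWordFilter {
    // Returns the lower-cased words that are not in the stop-word list.
    static Vector<String> filter(String[] words, String[] stopWords) {
        Set<String> stops = new HashSet<>();
        for (String stop : stopWords) {
            stops.add(stop.trim().toLowerCase(Locale.ROOT));
        }
        Vector<String> kept = new Vector<>();
        for (String raw : words) {
            String word = raw.trim().toLowerCase(Locale.ROOT);
            // Computed per word, so a match on one word cannot carry over to the next.
            boolean match = stops.contains(word);
            if (!word.isEmpty() && !match) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String[] words = {"A", "comparative", "analysis", "of", "Arabidopsis", "leaf", "senescence"};
        String[] stopWords = {"a", "of", "the"};
        System.out.println(filter(words, stopWords));
    }
}
```

If the nested loops are kept instead, the equivalent change would be to assign match = false at the top of the outer loop body, before the inner loop runs.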
I am not sure if I am doing it right in the above code snippet, but I
am missing a lot of words when comparing the text.
Here is a set of words that it is supposed to return:
"Arabidopsis"
"senescence"
"proteins" and many words like this.
Could someone please help me with this? It's highly appreciated, as I am
close to the deadline for my project.
thanks
-L