`
jojo_java
  • 浏览: 93612 次
  • 性别: Icon_minigender_1
  • 来自: 南京
社区版块
存档分类
最新评论

Clear Special character

    博客分类:
  • JAVA
 
阅读更多

 

	public static String[] analyzer(String string) {
		List<String> list = new ArrayList<String>();
		try {
			StringReader reader = new StringReader(string);
			IKSegmenter ik = new IKSegmenter(reader, true);
			Lexeme lexeme = null;
			while ((lexeme = ik.next()) != null) {
				list.add(lexeme.getLexemeText());
			}
		} catch (IOException e) {
			e.printStackTrace();
		}
		return list.toArray(new String[list.size()]);
	}

	public static String[] generate(String string) {
		List<String> list = new ArrayList<String>();
		string = clear_special_character(string);
		String[] tags = string.split("[,\\s]");
		for (String tag : tags) {
			tag = tag.trim();
			if (tag.length() > 0) {
				list.add(tag);
			}
		}
		return list.toArray(new String[list.size()]);
	}

	public static String clear_special_character(String string) {
		string = string.replaceAll("\\pP|\\pS", " ");
		string = string.replaceAll("\\s+", " ");
		return string;
	}
	
分享到:
评论

相关推荐

    Microsoft Library MSDN4DOS.zip

    3.6 String and Character Translation Instructions 3.7 Instructions for Block-Structured Languages 3.8 Flag Control Instructions 3.9 Coprocessor Interface Instructions 3.10 Segment Register ...

    一个java正则表达式工具类源代码.zip(内含Regexp.java文件)

    * \a The alert (bell) character ('\u0007') \a The alert (bell) character ('\u0007') * \e The escape character ('\u001B') \e esc符号 ('\u001B') * \cx The control character ...

    rfc全部文档离线下载rfc1-rfc8505

    The link field is a special device used by the IMPs to limit certain kinds of congestion. They function as follows. Between every pair of HOSTs there are 32 logical full-duplex connections over ...

    2009 达内Unix学习笔记

    clear 清屏,清除(之前的内容并未删除,只是没看到,拉回上面可以看回)。 五、目录管理命令 pwd 显示当前所在目录,打印当前目录的绝对路径。 cd 进入某目录,DOS内部命令 显示或改变当前目录。 cd回车/cd ~ 都...

    端口查看工具

    contained comma character. * Version 1.34: o New Option: Remember Last Filter (The filter is saved in cports_filter.txt) * Version 1.33: o Added support for saving comma-delimited (.csv) files. ...

    UE(官方下载)

    UltraEdit includes several special insert functions under the Insert menu. You can use these functions to insert a file into the active file, insert a string into the file at every specified increment...

    netWindows_0.3.0_pre2

    WHETHER TORT (INCLUDING NEGLIGENCE), CONTRACT, OR OTHERWISE, SHALL THELICENSOR BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL,INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY CHARACTER ARISING ...

    JSP Simple Examples

    A directive is a way to give special instructions to the container at page translation time. The page directive is written on the top of the jsp page. Html tags in jsp In this example we have used ...

    Turbo C++ 3.0[DISK]

    special offer, and write for full details on how to receive a free IntroPak containing a $15 credit toward your first month's on-line charges. 2. Check with your local software dealer or users' ...

    Turbo C++ 3.00[DISK]

    special offer, and write for full details on how to receive a free IntroPak containing a $15 credit toward your first month's on-line charges. 2. Check with your local software dealer or users' ...

    VclZip pro v3.10.1

    Special OnGetNextTStream Event for Delphi 4,5, BCB 4, and 5 - Allows zipping multiple TStreams in one process - More efficient than calling ZipFromStream multiple times Capability to use the latest ...

    DebuggingWithGDB 6.8-2008

    Table of Contents Summary of gdb . . . . . . . . ....Free Software ....Free Software Needs Free Documentation ....Contributors to gdb....1 A Sample gdb Session ....2 Getting In and Out of gdb ....2.1 Invoking gdb ....

Global site tag (gtag.js) - Google Analytics