diff --git a/README.md b/README.md index e4df90e..4f67b30 100644 --- a/README.md +++ b/README.md @@ -1,2 +1,9 @@ # cjvt-srl-tagging -A framework for using mate-tagger to tag Kres. +We'll be using mate-tools to perform SRL on Kres. + +## mate-tools +Using **Full srl pipeline (including anna-3.3)** from the Downloads section. + +## Sources +[1] (mate-tools) https://code.google.com/archive/p/mate-tools/ + diff --git a/dockerfiles/mate-tool-env/README.md b/dockerfiles/mate-tool-env/README.md new file mode 100644 index 0000000..f038722 --- /dev/null +++ b/dockerfiles/mate-tool-env/README.md @@ -0,0 +1 @@ +We'll need java. diff --git a/tools/srl-20131216/LICENSE b/tools/srl-20131216/LICENSE new file mode 100644 index 0000000..d159169 --- /dev/null +++ b/tools/srl-20131216/LICENSE @@ -0,0 +1,339 @@ + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Lesser General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. 
Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. 
+ + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. 
You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. 
+ +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. 
However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. 
+You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. 
If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + <one line to give the program's name and a brief idea of what it does.> + Copyright (C) <year> <name of author> + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License along + with this program; if not, write to the Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) year name of author + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + <signature of Ty Coon>, 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. 
diff --git a/tools/srl-20131216/featuresets/chi/ac.feats b/tools/srl-20131216/featuresets/chi/ac.feats new file mode 100644 index 0000000..b83e526 --- /dev/null +++ b/tools/srl-20131216/featuresets/chi/ac.feats @@ -0,0 +1,24 @@ + +ArgDeprel - size: 30 # 70.973 +ArgWord - size: 15967 # 76.461 +PredLemmaSense - size: 8897 # 78.738 +DeprelPath - size: 4503 # 80.958 +Position - size: 3 # 82.420 +ArgPOS - size: 32 # 83.357 +RightWord - size: 5342 # 83.797 +ChildDepSet - size: 25 # 84.146 +RightPOS - size: 29 # 84.401 +PredLemma - size: 8436 # 84.713 +LeftSiblingPOS - size: 31 # 84.893 +ChildPOSSet - size: 33 # 85.126 +LeftSiblingWord - size: 7568 # 85.238 +PredWord - size: 8436 # 85.269 +LeftWord - size: 7290 # 85.354 +RightSiblingWord - size: 9315 # 85.396 +PredParentWord - size: 4009 # 85.428 +LeftPOS - size: 29 # 85.471 +PredPOS - size: 26 # 85.497 +Position+PredLemmaSense - size: 13073 # 86.466 +ArgPOS+PredLemma - size: 16846 +RightPOS+RightSiblingPOS - size: 367 +ArgWord+Position - size: 13582 \ No newline at end of file diff --git a/tools/srl-20131216/featuresets/chi/ai.feats b/tools/srl-20131216/featuresets/chi/ai.feats new file mode 100644 index 0000000..2ce92bc --- /dev/null +++ b/tools/srl-20131216/featuresets/chi/ai.feats @@ -0,0 +1,16 @@ + +POSPath - size: 5556 # 59.624 +ArgDeprel - size: 30 # 72.666 +DeprelPath - size: 4459 # 74.123 +PredLemmaSense - size: 8838 # 74.711 +ArgWord - size: 16026 # 75.023 +ChildPOSSet - size: 33 # 75.246 +ArgPOS - size: 31 # 75.401 +DepSubCat - size: 8310 # 75.527 +RightSiblingWord - size: 9459 # 75.705 +ChildDepSet+POSPath - size: 14414 # 76.044 +DeprelPath+LeftSiblingPOS - size: 8188 # 76.454 +ArgDeprel+RightPOS - size: 291 +DeprelPath+Position - size: 3130 +ArgDeprel+Position - size: 61 +ArgPOS+PredLemmaSense - size: 17873 # 77.56 \ No newline at end of file diff --git a/tools/srl-20131216/featuresets/chi/pd.feats b/tools/srl-20131216/featuresets/chi/pd.feats new file mode 100644 index 0000000..1b4a65d --- /dev/null +++ 
b/tools/srl-20131216/featuresets/chi/pd.feats @@ -0,0 +1,16 @@ + +ChildWordSet - size: 19416 # 96.485 +PredParentWord - size: 4772 # 97.881 +DepSubCat - size: 10376 # 98.183 +PredDeprel - size: 19 # 98.265 +ChildPOSSet - size: 34 # 98.296 +PredParentPOS - size: 31 # 98.328 +ChildDepSet - size: 24 # 98.358 +ChildWordSet+DepSubCat - size: 124898 # 98.719 +DepSubCat+PredParentWord - size: 25326 # 98.847 +ChildWordSet+PredParentWord - size: 62401 # 98.930 +ChildWordSet+PredParentPOS - size: 41068 # 98.978 +PredParentPOS+PredParentWord - size: 5317 # 99.014 +PredPOS+PredParentWord - size: 7509 # 99.035 +ChildWordSet+PredDeprel - size: 35242 # 99.053 +PredDeprel+PredParentWord - size: 7766 # 99.071 diff --git a/tools/srl-20131216/featuresets/chi/pi.feats b/tools/srl-20131216/featuresets/chi/pi.feats new file mode 100644 index 0000000..2d1ef54 --- /dev/null +++ b/tools/srl-20131216/featuresets/chi/pi.feats @@ -0,0 +1,18 @@ + +PredWord +PredLemma +PredParentWord +PredParentPOS +ChildDepSet +ChildWordSet +ChildWordSet+ChildDepSet +ChildPOSSet +ChildPOSSet+ChildDepSet +DepSubCat +PredDeprel +## +## NOTE: These features are just copied from the English dir. +## A feature selection should be run to fix this. +## This is an initial draft just to execute the system. 
+## +## \ No newline at end of file diff --git a/tools/srl-20131216/featuresets/eng/ac.feats b/tools/srl-20131216/featuresets/eng/ac.feats new file mode 100644 index 0000000..48be273 --- /dev/null +++ b/tools/srl-20131216/featuresets/eng/ac.feats @@ -0,0 +1,48 @@ +N +ArgWord - size: 11313 # 26.203 +PredWord - size: 7884 # 29.199 +ArgPOS - size: 39 # 29.937 +PredLemmaSense - size: 5440 # 30.351 +RightWord - size: 7840 # 30.774 +Position - size: 3 # 31.017 +LeftSiblingWord - size: 6679 # 31.131 +LeftWord - size: 5408 # 31.212 +PredLemma - size: 4617 # 31.269 +ChildPOSSet - size: 45 # 31.317 +RightPOS - size: 37 # 31.391 +LeftSiblingPOS - size: 45 # 31.423 +ArgWord+PredLemmaSense - size: 68929 # 32.584 +ArgPOS+PredLemmaSense - size: 23739 # 33.006 +POSPath+PredLemmaSense - size: 35339 # 33.274 +ArgWord+Position - size: 15700 # 33.444 +Position+PredLemmaSense - size: 10385 # 33.550 +PredLemmaSense+RightSiblingPOS - size: 25180 # 33.607 +LeftPOS+RightSiblingPOS - size: 585 # 33.672 +ArgPOS+Position - size: 85 # 33.688 +LeftSiblingPOS+PredLemmaSense - size: 23325 # 33.728 + +V +ArgWord - size: 11130 # 37.124 +ArgDeprel - size: 49 # 42.629 +PredLemmaSense - size: 5407 # 45.975 +ChildDepSet - size: 34 # 47.582 +DeprelPath - size: 6106 # 48.335 +RightWord - size: 7790 # 48.878 +ArgPOS - size: 40 # 49.367 +Position - size: 3 # 49.709 +PredPOS - size: 24 # 49.950 +LeftSiblingPOS - size: 45 # 50.151 +PredLemma - size: 4584 # 50.338 +LeftPOS - size: 41 # 50.392 +RightPOS - size: 40 # 50.509 +PredParentWord - size: 5272 # 50.602 +POSPath - size: 6553 # 50.679 +PredParentPOS - size: 35 # 50.710 +ArgWord+PredLemmaSense - size: 67759 # 51.331 +ArgDeprel+PredLemmaSense - size: 15668 # 51.443 +ChildDepSet+Position - size: 88 # 51.656 +ArgDeprel+RightPOS - size: 504 # 51.769 +Position+PredLemmaSense - size: 7868 # 51.906 +ArgPOS+ArgWord - size: 8273 # 52.073 +ArgPOS+PredLemma - size: 15080 # 52.210 +ArgDeprel+ArgPOS - size: 397 # 52.301 diff --git 
a/tools/srl-20131216/featuresets/eng/ai.feats b/tools/srl-20131216/featuresets/eng/ai.feats new file mode 100644 index 0000000..5a1a69f --- /dev/null +++ b/tools/srl-20131216/featuresets/eng/ai.feats @@ -0,0 +1,36 @@ +N +POSPath - size: 6735 # 31.748 +ArgWord - size: 11155 # 34.751 +Position - size: 3 # 39.713 +ArgPOS - size: 42 # 40.253 +DeprelPath - size: 6244 # 40.593 +PredLemma - size: 4600 # 41.474 +ChildWordSet - size: 11011 # 41.967 +RightPOS - size: 37 # 42.130 +PredPOS - size: 22 # 42.218 +RightWord - size: 7676 # 42.295 +ArgWord+Position - size: 15542 # 43.001 +Position+PredLemmaSense - size: 10315 # 43.531 +ArgPOS+PredLemmaSense - size: 23561 # 43.826 +DeprelPath+Position - size: 6742 # 44.075 +ArgDeprel+PredParentWord - size: 16768 # 44.424 +LeftPOS+PredWord - size: 23706 # 44.574 +ChildWordSet+PredLemmaSense - size: 59280 # 44.745 + +V +DeprelPath - size: 6201 # 53.649 +ArgPOS - size: 40 # 63.119 +ArgDeprel - size: 47 # 63.423 +POSPath - size: 6626 # 63.887 +RightSiblingWord - size: 5887 # 63.985 +ArgWord - size: 11102 # 64.090 +PredParentWord +PredParentPOS +Position +PredLemmaSense +ArgDeprel+DeprelPath - size: 9262 # 64.275 +POSPath+RightSiblingPOS - size: 10086 # 64.575 +ArgDeprel+ChildDepSet - size: 812 # 64.69 +ArgDeprel+ArgPOS - size: 450 # 64.838 +LeftPOS+RightPOS - size: 584 # 64.957 +ArgDeprel+PredDeprel - size: 580 # 64.988 diff --git a/tools/srl-20131216/featuresets/eng/pd.feats b/tools/srl-20131216/featuresets/eng/pd.feats new file mode 100644 index 0000000..0bdf1b2 --- /dev/null +++ b/tools/srl-20131216/featuresets/eng/pd.feats @@ -0,0 +1,33 @@ +N +ChildWordSet - size: 12813 # 96.565 +PredParentWord - size: 6437 # 98.151 +DepSubCat - size: 4390 # 98.611 +ChildPOSSet - size: 45 # 98.821 +PredParentPOS - size: 34 # 99.008 +PredPOS - size: 25 # 99.068 +PredDeprel - size: 34 # 99.123 +ChildDepSet - size: 35 # 99.173 +ChildWordSet+PredParentWord - size: 63125 # 99.451 +DepSubCat+PredParentWord - size: 22318 # 99.560 +ChildWordSet+PredParentPOS 
- size: 32279 # 99.651 +PredPOS+PredParentWord - size: 10783 # 99.681 +ChildWordSet+PredPOS - size: 24857 # 99.706 +ChildWordSet+ChildDepSet +ChildPOSSet+ChildDepSet + +V +ChildWordSet - size: 12813 # 96.565 +PredParentWord - size: 6437 # 98.151 +DepSubCat - size: 4390 # 98.611 +ChildPOSSet - size: 45 # 98.821 +PredParentPOS - size: 34 # 99.008 +PredPOS - size: 25 # 99.068 +PredDeprel - size: 34 # 99.123 +ChildDepSet - size: 35 # 99.173 +ChildWordSet+PredParentWord - size: 63125 # 99.451 +DepSubCat+PredParentWord - size: 22318 # 99.560 +ChildWordSet+PredParentPOS - size: 32279 # 99.651 +PredPOS+PredParentWord - size: 10783 # 99.681 +ChildWordSet+PredPOS - size: 24857 # 99.706 +ChildWordSet+ChildDepSet +ChildPOSSet+ChildDepSet diff --git a/tools/srl-20131216/featuresets/eng/pi.feats b/tools/srl-20131216/featuresets/eng/pi.feats new file mode 100644 index 0000000..973aff9 --- /dev/null +++ b/tools/srl-20131216/featuresets/eng/pi.feats @@ -0,0 +1,25 @@ +N +PredWord +PredLemma +PredParentWord +PredParentPOS +ChildDepSet +ChildWordSet +ChildWordSet+ChildDepSet +ChildPOSSet +ChildPOSSet+ChildDepSet +DepSubCat +PredDeprel + +V +PredWord +PredLemma +PredParentWord +PredParentPOS +ChildDepSet +ChildWordSet +ChildWordSet+ChildDepSet +ChildPOSSet +ChildPOSSet+ChildDepSet +DepSubCat +PredDeprel diff --git a/tools/srl-20131216/featuresets/ger/ac.feats b/tools/srl-20131216/featuresets/ger/ac.feats new file mode 100644 index 0000000..ca130fd --- /dev/null +++ b/tools/srl-20131216/featuresets/ger/ac.feats @@ -0,0 +1,29 @@ +V +ArgDeprel - size: 36 # 63.517 +PredLemmaSense - size: 1174 # 71.699 +ArgWord - size: 8237 # 74.273 +RightSiblingWord - size: 3662 # 75.997 +ChildDepSet - size: 34 # 76.703 +DeprelPath - size: 865 # 77.201 +PredLemma - size: 563 # 77.741 +POSPath - size: 982 # 78.094 +LeftSiblingFeats - size: 24 # 78.468 +LeftFeats - size: 22 # 78.738 +PredFeats - size: 13 # 78.904 +ArgFeats - size: 24 # 78.966 +RightSiblingPOS - size: 44 # 79.174 +PredParentPOS - size: 30 # 
79.381 +ArgPOS+PredLemmaSense - size: 5132 # 84.738 +ArgDeprel+PredLemmaSense - size: 4126 # 86.150 +LeftWord+PredLemmaSense - size: 9495 # 86.752 +DeprelPath+PredParentWord - size: 4398 # 87.272 +ArgDeprel+POSPath - size: 1501 # 87.417 +RightPOS+RightSiblingPOS - size: 602 # 87.604 +PredLemmaSense+RightSiblingPOS - size: 7665 # 87.419 +ArgWord+PredLemmaSense - size: 19264 # 87.732 +ArgFeats+PredLemmaSense - size: 17269 # 88.190 +ArgDeprel+ChildPOSSet - size: 799 # 88.419 +LeftSiblingPOS+RightSiblingPOS - size: 717 # 88.648 +Position+PredLemmaSense - size: 1902 # 88.794 +PredPOS+PredParentWord - size: 2110 # 89.002 +ArgWord+PredLemma - size: 18445 # 89.106 diff --git a/tools/srl-20131216/featuresets/ger/ai.feats b/tools/srl-20131216/featuresets/ger/ai.feats new file mode 100644 index 0000000..15c1094 --- /dev/null +++ b/tools/srl-20131216/featuresets/ger/ai.feats @@ -0,0 +1,21 @@ +V +PredPOS +PredLemma +PredLemmaSense +ChildDepSet +ArgWord +ArgPOS +ArgDeprel +LeftWord +LeftSiblingPOS +RightSiblingWord +DeprelPath +POSPath +ArgWord+PredLemmaSense +ArgPOS+PredLemmaSense +ArgPOS+RightSiblingPOS +ArgFeats+LeftPOS +LeftFeats+PredParentPOS +ChildDepSet+POSPath +ArgWord+PredLemma +ChildPOSSet+PredDeprel diff --git a/tools/srl-20131216/featuresets/ger/pd.feats b/tools/srl-20131216/featuresets/ger/pd.feats new file mode 100644 index 0000000..d938ec0 --- /dev/null +++ b/tools/srl-20131216/featuresets/ger/pd.feats @@ -0,0 +1,11 @@ +V +PredFeats +PredParentWord +PredParentPOS +PredParentFeats +DepSubCat +ChildDepSet +ChildWordSet +ChildPOSSet +DepSubCat+PredFeats +DepSubCat+PredParentFeats diff --git a/tools/srl-20131216/featuresets/ger/pi.feats b/tools/srl-20131216/featuresets/ger/pi.feats new file mode 100644 index 0000000..0255c4d --- /dev/null +++ b/tools/srl-20131216/featuresets/ger/pi.feats @@ -0,0 +1,12 @@ +V +PredWord +PredLemma +PredParentWord +PredParentPOS +ChildDepSet +ChildWordSet +ChildWordSet+ChildDepSet +ChildPOSSet +ChildPOSSet+ChildDepSet +DepSubCat 
+PredDeprel diff --git a/tools/srl-20131216/featuresets/spa/ac.feats b/tools/srl-20131216/featuresets/spa/ac.feats new file mode 100644 index 0000000..855256b --- /dev/null +++ b/tools/srl-20131216/featuresets/spa/ac.feats @@ -0,0 +1,17 @@ + +ArgDeprel - size: 40 # 50.718 +PredLemmaSense - size: 4318 # 69.992 +ArgWord - size: 14758 # 78.809 +RightWord - size: 13337 # 79.771 +PredLemma - size: 2436 # 80.966 +ChildDepSet - size: 36 # 81.791 +LeftWord - size: 1223 # 82.487 +ArgFeats - size: 51 # 82.753 +Position - size: 3 # 83.028 +RightSiblingWord - size: 9942 # 83.192 +RightPOS - size: 12 # 83.415 +RightFeats - size: 58 # 83.441 +PredFeats+PredLemmaSense - size: 85344 # 84.188 +ArgWord+RightWord - size: 42446 # 84.584 +ArgPOS+PredLemma - size: 8440 # 84.833 +ArgWord+ChildDepSet - size: 47488 # 83.53 diff --git a/tools/srl-20131216/featuresets/spa/ai.feats b/tools/srl-20131216/featuresets/spa/ai.feats new file mode 100644 index 0000000..097d4b8 --- /dev/null +++ b/tools/srl-20131216/featuresets/spa/ai.feats @@ -0,0 +1,13 @@ + +POSPath - size: 768 # 67.769 +ArgDeprel - size: 39 # 89.799 +DeprelPath - size: 1381 # 89.940 +LeftSiblingWord - size: 7889 # 89.978 +ArgPOS - size: 11 # 89.996 +LeftFeats - size: 57 # 90.069 +RightSiblingWord - size: 9074 # 90.112 +ArgWord - size: 13586 # 90.133 +LeftWord+POSPath - size: 2807 # 90.198 +DepSubCat+DeprelPath - size: 15319 # 90.291 +DeprelPath+Position - size: 1553 # 90.364 +POSPath+RightSiblingFeats - size: 9681 # 90.456 diff --git a/tools/srl-20131216/featuresets/spa/pd.feats b/tools/srl-20131216/featuresets/spa/pd.feats new file mode 100644 index 0000000..be15ac7 --- /dev/null +++ b/tools/srl-20131216/featuresets/spa/pd.feats @@ -0,0 +1,7 @@ + +ChildDepSet - size: 36 # 82.59 +PredFeats - size: 29 # 83.82 +ChildWordSet - size: 15446 # 84.51 +PredWord - size: 9404 # 84.56 +ChildDepSet+PredPOS - size: 75 # 84.77 +ChildPOSSet - size: 11495 # 84.97 diff --git a/tools/srl-20131216/featuresets/spa/pi.feats 
b/tools/srl-20131216/featuresets/spa/pi.feats new file mode 100644 index 0000000..ad5aa80 --- /dev/null +++ b/tools/srl-20131216/featuresets/spa/pi.feats @@ -0,0 +1,18 @@ + +PredWord +PredLemma +PredParentWord +PredParentPOS +ChildDepSet +ChildWordSet +ChildWordSet+ChildDepSet +ChildPOSSet +ChildPOSSet+ChildDepSet +DepSubCat +PredDeprel +## +## NOTE: These features are just copied from the English dir. +## A feature selection should be run to fix this. +## This is an initial draft just to execute the system. +## +## diff --git a/tools/srl-20131216/lib/APACHE-LICENSE-2.0.txt b/tools/srl-20131216/lib/APACHE-LICENSE-2.0.txt new file mode 100644 index 0000000..d645695 --- /dev/null +++ b/tools/srl-20131216/lib/APACHE-LICENSE-2.0.txt @@ -0,0 +1,202 @@ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. diff --git a/tools/srl-20131216/lib/GPL.txt b/tools/srl-20131216/lib/GPL.txt new file mode 100644 index 0000000..d159169 --- /dev/null +++ b/tools/srl-20131216/lib/GPL.txt @@ -0,0 +1,339 @@ + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Lesser General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. 
Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. 
+ + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. 
You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. 
+ +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. 
However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. 
+You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. 
If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + + Copyright (C) + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License along + with this program; if not, write to the Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) year name of author + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + , 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. 
diff --git a/tools/srl-20131216/lib/LIBNOTES b/tools/srl-20131216/lib/LIBNOTES new file mode 100644 index 0000000..5812b2d --- /dev/null +++ b/tools/srl-20131216/lib/LIBNOTES @@ -0,0 +1,72 @@ +This file briefly describes the role of the libraries which are used +by the SRL system and references their associated licenses. + +------------------------------------------------------------------------ +anna-3-3.jar + +mate-tools anna, Downloaded 2012-11-16 10:20 CET +Homepage: http://code.google.com/p/mate-tools/ +License: GPL (see GPL.txt) + +The jar from mate-tools. +Used for lemmatizing, pos tagging, morphological tagging and +dependency parsing. + +------------------------------------------------------------------------ +liblinear-1.51-with-deps.jar + +liblinear (java implementation), version 1.51 +Homepage: http://www.bwaldvogel.de/liblinear-java/ +License: own license (see LL-LICENSE.txt) + +Powerful machine-learning library. +Used for training models in the SRL system. + +------------------------------------------------------------------------ +opennlp-tools-1.5.2-incubating.jar + +OpenNLP Tools, version 1.5.2 +Homepage: http://opennlp.apache.org/ +License: Apache License (see APACHE-LICENSE-2.0.txt) + +Package that provides various NLP tools. +Used for the tokenization step in the pipeline. + +------------------------------------------------------------------------ +opennlp-maxent-3.0.2-incubating.jar + +Part of OpenNLP Tools, version 1.5.2 +Homepage: http://opennlp.apache.org/ +License: Apache License (see APACHE-LICENSE-2.0.txt) + +Package that provides various NLP tools. +Used for the tokenization step in the pipeline. + +------------------------------------------------------------------------ +seg.jar + +Stanford Chinese Word Segmenter, version 3.2.0 +Homepage: http://nlp.stanford.edu/software/segmenter.shtml +License: GPL (see GPL.txt) + +Chinese word segmenter. Used for Chinese. 
+ +------------------------------------------------------------------------ +stanford-parser.jar + +Stanford Parser, version 3.2.0 +Homepage: http://nlp.stanford.edu/software/lex-parser.shtml +License: GPL (see GPL.txt) + +Stanford Parser. Used for tokenization of English and French. + +------------------------------------------------------------------------ +whatswrong-0.2.3.jar + +What's Wrong With My NLP?, version 0.2.3 +Homepage: http://code.google.com/p/whatswrong/ +License: GPL (see GPL.txt) + +A visualizer for Natural Language Processing problems. +Used by the HTTP interface to generate graphical output of dependency +graphs. diff --git a/tools/srl-20131216/lib/LL-LICENSE.txt b/tools/srl-20131216/lib/LL-LICENSE.txt new file mode 100644 index 0000000..4ec3b86 --- /dev/null +++ b/tools/srl-20131216/lib/LL-LICENSE.txt @@ -0,0 +1,30 @@ +Copyright (c) 2007-2008 The LIBLINEAR Project. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +1. Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +3. Neither name of copyright holders nor the names of its contributors +may be used to endorse or promote products derived from this software +without specific prior written permission. + + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE REGENTS OR +CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. \ No newline at end of file diff --git a/tools/srl-20131216/lib/anna-3.3.jar b/tools/srl-20131216/lib/anna-3.3.jar new file mode 100644 index 0000000..0d30019 Binary files /dev/null and b/tools/srl-20131216/lib/anna-3.3.jar differ diff --git a/tools/srl-20131216/lib/liblinear-1.51-with-deps.jar b/tools/srl-20131216/lib/liblinear-1.51-with-deps.jar new file mode 100644 index 0000000..d01114a Binary files /dev/null and b/tools/srl-20131216/lib/liblinear-1.51-with-deps.jar differ diff --git a/tools/srl-20131216/lib/opennlp-maxent-3.0.2-incubating.jar b/tools/srl-20131216/lib/opennlp-maxent-3.0.2-incubating.jar new file mode 100644 index 0000000..c63f07a Binary files /dev/null and b/tools/srl-20131216/lib/opennlp-maxent-3.0.2-incubating.jar differ diff --git a/tools/srl-20131216/lib/opennlp-tools-1.5.2-incubating.jar b/tools/srl-20131216/lib/opennlp-tools-1.5.2-incubating.jar new file mode 100644 index 0000000..45e5634 Binary files /dev/null and b/tools/srl-20131216/lib/opennlp-tools-1.5.2-incubating.jar differ diff --git a/tools/srl-20131216/lib/seg.jar b/tools/srl-20131216/lib/seg.jar new file mode 100644 index 0000000..2480442 Binary files /dev/null and b/tools/srl-20131216/lib/seg.jar differ diff --git a/tools/srl-20131216/lib/stanford-parser.jar b/tools/srl-20131216/lib/stanford-parser.jar new file mode 100644 index 0000000..468096a Binary files /dev/null and b/tools/srl-20131216/lib/stanford-parser.jar differ diff --git 
a/tools/srl-20131216/lib/whatswrong-0.2.3.jar b/tools/srl-20131216/lib/whatswrong-0.2.3.jar new file mode 100644 index 0000000..09705cc Binary files /dev/null and b/tools/srl-20131216/lib/whatswrong-0.2.3.jar differ diff --git a/tools/srl-20131216/scripts/README b/tools/srl-20131216/scripts/README new file mode 100644 index 0000000..332ac59 --- /dev/null +++ b/tools/srl-20131216/scripts/README @@ -0,0 +1,83 @@ +This folder contains some scripts to execute the system. They all need +to be edited slightly before use with proper paths to corpora and/or +models. They are meant to be executed from the parent directory, e.g. + + cd $DIST_ROOT + sh scripts/learn.sh + +There are some comments in the scripts on switches etc. The amount of +memory used is typically what is required for the English CoNLL 2009 +corpora. It might be possible to push it down a bit. + + +Since the system grew out of the CoNLL 2009 ST, there are a couple of +different ways to parse a corpus: + +(i) parse_full.sh - parses a complete corpus using all steps of the + pipeline except tokenization. It pretty much + assumes that the file contains tokens in the + second column and disregards the rest. + If the -nopi switch is used, it needs to have the + IsPred column from the CoNLL 2009 data format. + +(ii) parse_srl_only.sh - parses semantic roles only. The input is + expected to be the CoNLL 2009 data format + with proper dependency trees (i.e. the + SRLonly evaluation corpus). + In order to replicate the setting of the 2009 + ST, one can use the -nopi switch to skip the + predicate identification step. + + +Then there is also the HTTP interface. This is started by the +run_http_server.sh script. Again, edit the file with proper paths +before executing it. + +NOTE: the HTTP server depends on the Java package +com.sun.net.httpserver (cf.
+http://download.oracle.com/javase/6/docs/jre/api/net/httpserver +/spec/com/sun/net/httpserver/package-summary.html +and +http://blogs.sun.com/michaelmcm/entry/http_server_api_in_java +), which is not part of the real Java specification, but comes with +most (or at least some) JRE distributions. From my own experience, +it is included in the Sun Java 6 distribution* as well as the OpenJDK +Java 6**. + +[[ + *: + On a Mac: + % java -version + java version "1.6.0_17" + Java(TM) SE Runtime Environment (build 1.6.0_17-b04-248-10M3025) + Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01-101, mixed mode) + % + + and + On pc, mandriva linux: + % /usr/lib/jvm/java-sun/bin/java -version + java version "1.6.0_15" + Java(TM) SE Runtime Environment (build 1.6.0_15-b03) + Java HotSpot(TM) 64-Bit Server VM (build 14.1-b02, mixed mode) + % + + **: + On pc, mandriva linux: + % java -version + java version "1.6.0_18" + OpenJDK Runtime Environment (IcedTea6 1.8) (mandriva-2.b18.2mdv2009.1-x86_64) + OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode) + % +]] + +The graphical dependency graph output of the HTTP interface relies on +having Chinese fonts installed properly locally. On Linux I had some issues +with this, but resolved them according to +http://blog.lizhao.net/2007/03/java-chinese-fonts-on-ubuntu.html + +The graphical dependency graph output also seems to work less well +when using OpenJDK. The images get strange lines through them. The Sun +JRE seems to work fine though. + + +Feedback and questions are appreciated: anders@ims.uni-stuttgart.de diff --git a/tools/srl-20131216/scripts/learn.sh b/tools/srl-20131216/scripts/learn.sh new file mode 100644 index 0000000..47a36b3 --- /dev/null +++ b/tools/srl-20131216/scripts/learn.sh @@ -0,0 +1,38 @@ +#!/bin/sh + +## There are three sets of options that need, may need to, and could be changed. +## (1) deals with input and output.
You have to set these (in particular, you need to provide a training corpus) +## (2) deals with the jvm parameters and may need to be changed +## (3) deals with the behaviour of the system + +## For further information on switches, see the source code, or run +## java -cp srl.jar se.lth.cs.srl.Learn --help + +################################################## +## (1) The following needs to be set appropriately +################################################## +CORPUS=~/corpora/conll09/spa/CoNLL2009-ST-Spanish-train.txt.pdeps #training corpus +Lang="spa" +MODEL="srl-$Lang.model" + +################################################## +## (2) These ones may need to be changed +################################################## +JAVA="java" #Edit this if you want to use a specific java binary. +MEM="4g" #Memory for the JVM, might need to be increased for large corpora. +CP="srl.jar:lib/liblinear-1.51-with-deps.jar" +JVM_ARGS="-cp $CP -Xmx$MEM" + +################################################## +## (3) The following changes the behaviour of the system +################################################## +#LLBINARY="-llbinary /home/anders/liblinear-1.6/train" #Path to locally compiled liblinear. Uncomment this and correct the path if you have it. This will make training models faster (30-40%). The models come out slightly differently compared to the Java version though, due to floating-point arithmetic. +#RERANKER="-reranker" #Uncomment this if you want to train a reranker too. This takes about 8 times longer than the simple pipeline.
+ + +#Execute +CMD="$JAVA $JVM_ARGS se.lth.cs.srl.Learn $Lang $CORPUS $MODEL $RERANKER $LLBINARY" +echo "Executing: $CMD" + +$CMD + diff --git a/tools/srl-20131216/scripts/parse_full.sh b/tools/srl-20131216/scripts/parse_full.sh new file mode 100644 index 0000000..23a7708 --- /dev/null +++ b/tools/srl-20131216/scripts/parse_full.sh @@ -0,0 +1,59 @@ +#!/bin/sh + +## There are three sets of options that need, may need to, and could be changed. +## (1) deals with input and output. You have to set these (in particular, you need to provide models) +## (2) deals with the jvm parameters and may need to be changed +## (3) deals with the behaviour of the system + +## For further information on switches, see the source code, or run +## java -cp srl.jar se.lth.cs.srl.Parse --help + +################################################## +## (1) The following needs to be set appropriately +################################################## +#INPUT="/home/anders/corpora/conll09/eng/CoNLL2009-evaluation-English-SRLonly.txt" #evaluation corpus +INPUT=/home/anders/corpora/conll09/chi/CoNLL2009-ST-evaluation-Chinese-SRLonly.txt +LANG="chi" +##TOKENIZER_MODEL="models/eng/EnglishTok.bin.gz" #This is not used here anyway. The input is assumed to be segmented/tokenized already. +##LEMMATIZER_MODEL="models/chi/lemma-eng.model" +POS_MODEL="models/chi/tag-chn.model" +#MORPH_MODEL="models/ger/morph-ger.model" #Morphological tagger is not applicable to English. Fix the path and uncomment if you are running German. +PARSER_MODEL="models/chi/prs-chn.model" +SRL_MODEL="models/chi/srl-chn.model" +OUTPUT="$LANG.out" + +################################################## +## (2) These ones may need to be changed +################################################## +JAVA="java" #Edit this if you want to use a specific JRE. +MEM="4g" #Memory for the JVM, might need to be increased for large corpora.
+CP="srl.jar:lib/anna-3.3.jar:lib/liblinear-1.51-with-deps.jar:lib/opennlp-tools-1.5.2-incubating.jar:lib/opennlp-maxent-3.0.2-incubating.jar:lib/seg.jar" +JVM_ARGS="-cp $CP -Xmx$MEM" + +################################################## +## (3) The following changes the behaviour of the system +################################################## +#RERANKER="-reranker" #Uncomment this if you want to use a reranker too. The model is assumed to contain a reranker. While training, the corresponding parameter has to be provided. +#NOPI="-nopi" #Uncomment this if you want to skip the predicate identification step. + + + +################################################## + +CMD="$JAVA $JVM_ARGS se.lth.cs.srl.CompletePipeline $LANG $NOPI $RERANKER -tagger $POS_MODEL -parser $PARSER_MODEL -srl $SRL_MODEL -test $INPUT -out $OUTPUT" + +if [ "$TOKENIZER_MODEL" != "" ]; then + CMD="$CMD -token $TOKENIZER_MODEL" +fi + +if [ "$LEMMATIZER_MODEL" != "" ]; then + CMD="$CMD -lemma $LEMMATIZER_MODEL" +fi + +if [ "$MORPH_MODEL" != "" ]; then + CMD="$CMD -morph $MORPH_MODEL" +fi + +echo "Executing: $CMD" + +$CMD diff --git a/tools/srl-20131216/scripts/parse_srl_only.sh b/tools/srl-20131216/scripts/parse_srl_only.sh new file mode 100644 index 0000000..e880049 --- /dev/null +++ b/tools/srl-20131216/scripts/parse_srl_only.sh @@ -0,0 +1,37 @@ +#!/bin/sh + +## There are three sets of options that need, may need to, and could be changed. +## (1) deals with input and output.
You have to set these (in particular, you need to provide an input corpus and a trained model) +## (2) deals with the jvm parameters and may need to be changed +## (3) deals with the behaviour of the system + +## For further information on switches, see the source code, or run +## java -cp srl.jar se.lth.cs.srl.Parse --help + +################################################## +## (1) The following needs to be set appropriately +################################################## +INPUT=~/corpora/conll09/spa/CoNLL2009-ST-evaluation-Spanish-SRLonly.txt +Lang="spa" +MODEL="./srl-spa.model" +OUTPUT="${Lang}-eval.out" + +################################################## +## (2) These ones may need to be changed +################################################## +JAVA="java" #Edit this if you want to use a specific java binary. +MEM="2g" #Memory for the JVM, might need to be increased for large corpora. +CP="srl.jar:lib/liblinear-1.51-with-deps.jar:lib/anna-3.3.jar" +JVM_ARGS="-cp $CP -Xmx$MEM" + +################################################## +## (3) The following changes the behaviour of the system +################################################## +#RERANKER="-reranker" #Uncomment this if you want to use a reranker too. The model is assumed to contain a reranker. While training, this has to be set appropriately. +NOPI="-nopi" #Skips the predicate identification step; comment this out if you want to run it. This setting is equivalent to the CoNLL 2009 ST. + + +CMD="$JAVA $JVM_ARGS se.lth.cs.srl.Parse $Lang $INPUT $MODEL $RERANKER $NOPI $OUTPUT" +echo "Executing: $CMD" + +$CMD diff --git a/tools/srl-20131216/scripts/run_anna_http_server.sh b/tools/srl-20131216/scripts/run_anna_http_server.sh new file mode 100644 index 0000000..413228d --- /dev/null +++ b/tools/srl-20131216/scripts/run_anna_http_server.sh @@ -0,0 +1,57 @@ +#!/bin/sh + +## There are three sets of options that need, may need to, and could be changed. +## (1) deals with input and output.
You have to set these (in particular, you need to provide models) +## (2) deals with the jvm parameters and may need to be changed +## (3) deals with the behaviour of the system + +################################################## +## (1) The following needs to be set appropriately +################################################## +Lang="eng" +MODELDIR=`dirname $0`/../../models/eng/ +#TOKENIZER_MODEL=${MODELDIR}/en-token.bin #If tokenizer is blank, it will use some default (Stanford for English, Exner for Swedish, and whitespace otherwise) +#TOKENIZER_MODEL="models/chi/stanford-chinese-segmenter-2008-05-21/data" #Use this for Chinese. +LEMMATIZER_MODEL=${MODELDIR}/CoNLL2009-ST-English-ALL.anna-3.3.lemmatizer.model +POS_MODEL=${MODELDIR}/CoNLL2009-ST-English-ALL.anna-3.3.postagger.model +#MORPH_MODEL=${MODELDIR}/ #No morph model for English. +PARSER_MODEL=${MODELDIR}/CoNLL2009-ST-English-ALL.anna-3.3.parser.model + +PORT=8073 #The port to listen on + +################################################## +## (2) These ones may need to be changed +################################################## +JAVA="java" #Edit this if you want to use a specific java binary. +MEM="4g" #Memory for the JVM, might need to be increased for large corpora. +DIST_ROOT=`dirname $0`/.. +CP=${DIST_ROOT}/srl.jar +for jar in ${DIST_ROOT}/lib/*.jar; do +# echo $jar + CP=${CP}:$jar +done +#exit 0; +JVM_ARGS="-Djava.awt.headless=true -cp $CP -Xmx$MEM" +# The java.awt.headless property is needed to render the images of dependency graphs if the server is executed remotely (and there is no GUI stuff involved anyway) + +################################################## +## (3) The following changes the behaviour of the system +################################################## +#RERANKER="-reranker" #Uncomment this if you want to use a reranker too. The model is assumed to contain a reranker. While training, the corresponding parameter has to be provided.
+ +CMD="$JAVA $JVM_ARGS se.lth.cs.srl.http.AnnaHttpPipeline $Lang $RERANKER -tagger $POS_MODEL -parser $PARSER_MODEL -port $PORT" + +if [ "$TOKENIZER_MODEL" != "" ]; then + CMD=${CMD}" -token $TOKENIZER_MODEL" +fi + +if [ "$LEMMATIZER_MODEL" != "" ]; then + CMD="$CMD -lemma $LEMMATIZER_MODEL" +fi + +if [ "$MORPH_MODEL" != "" ]; then + CMD="$CMD -morph $MORPH_MODEL" +fi + +echo "Executing: $CMD" +$CMD diff --git a/tools/srl-20131216/scripts/run_http_server.sh b/tools/srl-20131216/scripts/run_http_server.sh new file mode 100644 index 0000000..5537825 --- /dev/null +++ b/tools/srl-20131216/scripts/run_http_server.sh @@ -0,0 +1,61 @@ +#!/bin/sh + +## There are three sets of options that need, may need to, and could be changed. +## (1) deals with input and output. You have to set these (in particular, you need to provide models) +## (2) deals with the jvm parameters and may need to be changed +## (3) deals with the behaviour of the system + +## For further information on switches, see the source code, or run +## java -cp srl.jar se.lth.cs.srl.http.HttpPipeline + +################################################## +## (1) The following needs to be set appropriately +################################################## +Lang="eng" +MODELDIR=`dirname $0`/../../models/eng/ +#TOKENIZER_MODEL=${MODELDIR}/en-token.bin #If tokenizer is blank, it will use some default (Stanford for English, Exner for Swedish, and whitespace otherwise) +#TOKENIZER_MODEL="models/chi/stanford-chinese-segmenter-2008-05-21/data" #Use this for Chinese. +LEMMATIZER_MODEL=${MODELDIR}/CoNLL2009-ST-English-ALL.anna-3.3.lemmatizer.model +POS_MODEL=${MODELDIR}/CoNLL2009-ST-English-ALL.anna-3.3.postagger.model +#MORPH_MODEL=${MODELDIR}/ #No morph model for English.
+PARSER_MODEL=${MODELDIR}/CoNLL2009-ST-English-ALL.anna-3.3.parser.model +SRL_MODEL=${MODELDIR}/CoNLL2009-ST-English-ALL.anna-3.3.srl.model + +PORT=8072 #The port to listen on + +################################################## +## (2) These ones may need to be changed +################################################## +JAVA="java" #Edit this if you want to use a specific java binary. +MEM="4g" #Memory for the JVM, might need to be increased for large corpora. +DIST_ROOT=`dirname $0`/.. +CP=${DIST_ROOT}/srl.jar +for jar in ${DIST_ROOT}/lib/*.jar; do +# echo $jar + CP=${CP}:$jar +done +#exit 0; +JVM_ARGS="-Djava.awt.headless=true -cp $CP -Xmx$MEM" +# The java.awt.headless property is needed to render the images of dependency graphs if the server is executed remotely (and there is no GUI stuff involved anyway) + +################################################## +## (3) The following changes the behaviour of the system +################################################## +#RERANKER="-reranker" #Uncomment this if you want to use a reranker too. The model is assumed to contain a reranker. While training, the corresponding parameter has to be provided. + +CMD="$JAVA $JVM_ARGS se.lth.cs.srl.http.SRLHttpPipeline $Lang $RERANKER -tagger $POS_MODEL -parser $PARSER_MODEL -srl $SRL_MODEL -port $PORT" + +if [ "$TOKENIZER_MODEL" != "" ]; then + CMD=${CMD}" -token $TOKENIZER_MODEL" +fi + +if [ "$LEMMATIZER_MODEL" != "" ]; then + CMD="$CMD -lemma $LEMMATIZER_MODEL" +fi + +if [ "$MORPH_MODEL" != "" ]; then + CMD="$CMD -morph $MORPH_MODEL" +fi + +echo "Executing: $CMD" +$CMD diff --git a/tools/srl-20131216/srl-src.jar b/tools/srl-20131216/srl-src.jar new file mode 100644 index 0000000..d4fab25 Binary files /dev/null and b/tools/srl-20131216/srl-src.jar differ diff --git a/tools/srl-20131216/srl.jar b/tools/srl-20131216/srl.jar new file mode 100644 index 0000000..9579916 Binary files /dev/null and b/tools/srl-20131216/srl.jar differ
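Reviewer note: several of the scripts above set `CP` by hand, and the `java` launcher silently skips classpath entries that do not exist, so a jar name that does not match what is in `lib/` would only show up later as a class-loading error. A pre-flight check along the following lines might catch that earlier. This is a minimal sketch, not part of the srl distribution: the function name `missing_classpath_entries` is hypothetical, and it assumes a POSIX `sh` and a colon-separated classpath as used in these scripts.

```sh
# Hypothetical helper (an assumption, not shipped with mate-tools):
# print every entry of a colon-separated classpath that does not
# exist on disk, one entry per line. Prints nothing if all exist.
missing_classpath_entries() {
    old_ifs=$IFS
    IFS=:
    # $1 is left unquoted on purpose so the shell splits it on ':'.
    for entry in $1; do
        # Skip empty fields such as the one in "a.jar::b.jar".
        [ -n "$entry" ] || continue
        [ -e "$entry" ] || printf '%s\n' "$entry"
    done
    IFS=$old_ifs
}
```

One possible use is to call `missing_classpath_entries "$CP"` just before the `echo "Executing: $CMD"` line of a script and abort if the output is non-empty. The scripts that build `CP` from `lib/*.jar` would gain less from it, since the glob only picks up files that are present.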