Recent activity
dpurp replied on June 12, 2009 04:21 to the problem "Firefox 3.5b4 compatibility issues" in quub:
dpurp shared an idea in LUCI on November 26, 2008 17:21:
mysql hints
- set the mysql key_buffer_size to ~30% of your total memory (default is something like 16mb)
- use the myisam database engine
- after importing (see Mo's idea page [link]):
- remove all pages from the "page" table that do not have the 0 page_namespace or are not redirects
- establish primary keys/indexes for id/join columns for tables (e.g. page_id, page_latest, rev_id, rev_text_id, old_id, etc...)
- test out different ways of writing queries. remember, your goal is to use the indexes in the table, which mysql may not actually use depending on how you form your query. if you put the word "explain" in front of your mysql query, it will show you how it is executing the query and which indexes it will use. for me, a bad query to join the 3 tables took ~20min; a good query took ~3sec.
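The "explain" hint above can be tried out without a MySQL server. The following is only a rough sketch: it swaps in SQLite (bundled with Python) and its `EXPLAIN QUERY PLAN` statement as a stand-in for MySQL's `EXPLAIN`, and the `revision` table here is a made-up two-column simplification, not the real MediaWiki schema. The index name `idx_rev_id` and the helper `plan` are hypothetical.

```python
import sqlite3

def plan(conn, sql):
    """Return the plan descriptions SQLite reports for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable description is the last column of each row.
    return [row[-1] for row in rows]

# In-memory database with a simplified, hypothetical revision table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revision (rev_id INTEGER, rev_text_id INTEGER)")

query = "SELECT rev_text_id FROM revision WHERE rev_id = 42"

before = plan(conn, query)  # no index yet, so the table is scanned
conn.execute("CREATE INDEX idx_rev_id ON revision (rev_id)")
after = plan(conn, query)   # the planner can now look rows up via idx_rev_id

print("without index:", before)
print("with index:   ", after)
```

The exact wording of the plan differs between SQLite versions and looks nothing like MySQL's tabular `EXPLAIN` output, but the idea is the same: check the plan before and after adding an index, and rewrite the query if the index is not being used.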
A comment on the question "Wikipedia Dump Woes" in LUCI:
also... if you are using linux, be sure you are using sun's java runtime instead of gcj (ubuntu's default). switching made a significant difference for me in the import process – dpurp, on November 23, 2008 09:17
dpurp replied on November 23, 2008 02:04 to the question "Wikipedia Dump Woes" in LUCI:
I was able to download the static html dump in ~1hr (thank you uci housing bandwidth). however, uncompressed the tar file is ~200gb, which does not seem to work well with my 160gb hard drive.
on ubuntu, setting up mediawiki is extremely easy
- sudo apt-get install mediawiki
- navigate to http://localhost/mediawiki
- complete configuration
i'm currently importing the xml file into mysql.
as far as the mediawiki syntax... i'm thinking about stripping all non-alphabetical characters from the text. unless there's something i haven't thought of, those characters aren't necessary, and stripping them will remove the wiki syntax at the same time.
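A minimal sketch of that stripping idea, assuming "alphabetical" means ASCII letters only and that runs of removed characters should collapse to a single space. The function name `strip_non_alpha` and the sample sentence are made up for illustration.

```python
import re

def strip_non_alpha(text):
    """Replace every run of non-letter characters with one space."""
    return re.sub(r"[^A-Za-z]+", " ", text).strip()

# Made-up snippet using bold markup and a piped wiki link.
sample = "The '''quick''' [[brown fox|fox]] jumped 3 times."
print(strip_non_alpha(sample))
```

This prints `The quick brown fox fox jumped times`. The markup characters are gone, but so are digits, and both the target and the label of the piped link survive as words, so this approach removes wiki punctuation without removing everything the syntax wraps.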
stemming:
surprisingly there do not seem to be very many stemmers available. as far as i can tell the big name stemmer is the porter algorithm (java version available here: http://snowball.tartarus.org/runtime/...). if you have lucene available, these classes are in it as well.
dpurp started following the idea "Some useful links" in LUCI.
A comment on the idea "Some hints for Assignment 04" in LUCI:
it didn't work for me when i had my extension in My Docs (Windows), but when i moved it to C:\extensions\my_extension\ as the tutorial suggested it worked fine. perhaps the problem was due to the spaces in the My Docs file path? – dpurp, on November 09, 2008 22:46
dpurp replied on November 09, 2008 22:11 to the idea "Some hints for Assignment 04" in LUCI:
Instead of repackaging your extension after each modification, you can just create a link to your source directory, and firefox will automatically load the latest content each time the sidebar is opened. (read Testing in Creating a Firefox sidebar)
Once you have created this link in your extensions folder, opening and closing the sidebar in firefox is enough to reload your extension (you don't have to restart firefox). Note: unless your code removes the event listeners for your extension when the sidebar is closed, the listeners will continue to exist until firefox is closed.
dpurp replied on November 09, 2008 21:59 to the idea "Catching load events of the main window from your sidebar." in LUCI:
The second link is particularly helpful in getting event notifications from opened content windows.
Those who don't have page loads working yet:
pay attention to the section on Finding already opened windows and read the page it links to: nsIWindowMediator
dpurp replied on November 09, 2008 08:09 to the idea "Accessing links in the page" in LUCI:
dpurp replied on October 18, 2008 17:22 to the idea "Palindrome examples" in LUCI: