WorkInProgress

From USYVL Development Wiki

Not sure of this section 2011-10

Overhauled the poolLayout code I was working on last week. This does not touch the system that assigns teams to a given pool, only what the pool's game match-ups look like (it just shuffles the pool matrices around). I had to change how the pool layout data is managed so that more options can be expressed more compactly and with less repetition of data.

I also added a setting with three options for choosing the pool layout data:

  • Fixed - the original single layout template for a pool, based on the number of teams.
  • Site count - fixed template variations based on the number of sites with teams in the pool and the number of teams.
  • Minimize collisions - loops over all template variations (fixed, site count, plus any others provided), checking the number of collisions for each, and returns the first variation encountered with the lowest collision count. This way, less desirable variations can be loaded last to minimize the chance of their being used.
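
The "minimize collisions" option can be sketched roughly as follows. This is a minimal illustration, not the scheduler's actual code: it assumes a collision means a match between two teams from the same site, that a layout is a list of (slot, slot) pairs, and that `team_sites` gives the site of the team in each slot. The names `count_collisions` and `pick_layout` are made up for this example.

```python
def count_collisions(layout, team_sites):
    """Count matches whose two teams come from the same site (assumed definition)."""
    return sum(1 for a, b in layout if team_sites[a] == team_sites[b])


def pick_layout(variations, team_sites):
    """Return (layout, collisions) for the first variation with the lowest count."""
    best, best_count = None, None
    for layout in variations:
        n = count_collisions(layout, team_sites)
        # Strict '<' keeps the earliest variation on ties, so variations
        # loaded last are only chosen when they are strictly better.
        if best_count is None or n < best_count:
            best, best_count = layout, n
    return best, best_count


if __name__ == "__main__":
    team_sites = ["A", "A", "B", "C"]      # hypothetical 4-team pool
    fixed = [(0, 1), (2, 3)]               # pairs the two site-A teams
    alternate = [(0, 2), (1, 3)]           # avoids the same-site match
    print(pick_layout([fixed, alternate], team_sites))
```

Because ties go to the earliest variation, the load order of the templates doubles as a preference order.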

This should let us go back to the old system completely by changing that setting and reprocessing the data. (I say "should" only because the data management is handled very differently now, and I didn't want to keep the old data structures coded in when all the same data should be converted to the new system.)

Tested against some early spring 2009 data. Collisions out of 1384 matches:

  • Fixed: 598
  • Site count: 500
  • Minimize collisions: 462
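
To put those counts on a common scale, a quick sketch that turns them into percentages of the 1384 matches. The function name `collision_rates` is just for this example; the numbers are the ones reported above.

```python
def collision_rates(total, counts):
    """Return each collision count as a percentage of total matches (1 decimal)."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    counts = {"fixed": 598, "site count": 500, "minimize collisions": 462}
    for name, pct in collision_rates(1384, counts).items():
        print(f"{name}: {pct}%")
```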

This number should go down significantly as we add team data, since many tournaments have team data for only one team (can't avoid 100% collisions in that case :-).

Beyond this, improvements will most likely involve changing the pool assignment algorithms. I have thought of one variation that would require looking at site/team counts in a pool and handling the case of one site being larger than the two others. Instead of trying to maximize site variation in the pool, it may work better to pool the two smaller sites together and treat them as one team (as far as pool assignment goes). The two smaller sites may not have many matches against each other, but the larger site would not have lots of collisions either.

I have some ideas on how to handle that, but want to see what issues we still see once we get the team data in place.
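
One possible shape for the "pool the two smaller sites together" idea, purely as a sketch. It assumes the input is a mapping of site name to team count, and that merging only applies when there are exactly three sites and one is strictly larger than both others. `group_sites` and the `"X+Y"` label format are invented here and are not part of the existing code.

```python
def group_sites(site_counts):
    """Map each site to a group label used for pool assignment.

    With exactly three sites where one is strictly the largest, the two
    smaller sites share one label; otherwise every site keeps its own.
    """
    if len(site_counts) == 3:
        big, mid, small = sorted(site_counts, key=site_counts.get, reverse=True)
        if site_counts[big] > site_counts[mid]:
            merged = f"{mid}+{small}"
            return {big: big, mid: merged, small: merged}
    return {site: site for site in site_counts}


if __name__ == "__main__":
    # Hypothetical team counts per site.
    print(group_sites({"A": 6, "B": 2, "C": 3}))
```

The pool assignment step would then spread group labels across pools instead of raw site names.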

The only possible problem that I envision with the new system is the slight possibility that the minimization might "undo" some of the variations that are applied to subsequent tournaments to get different team match-ups in those later tournaments. I think the chances of that are small, and it is probably not overly critical.


Did some finds and diffs to verify which files are created by which program, and when. Snapshots compared: a_reset b_geocode c_bpi d_ncf e_gts f_wtsl g_uta

Geocode

fablio-mpro:~/Sites/sites/usyvl/scheduling[37]$ diff a_reset b_geocode 
> ./input/2009_A_Spring-locations_geocodes.kml
> ./input/2009_A_Spring-locations_geocodes.tab
> ./input/2009_A_Spring-locations_manual.tab
> ./input/2009_A_Spring-locations_zipcodes.tab
> ./schedules/allsites.kml

BuildProximityInput

fablio-mpro:~/Sites/sites/usyvl/scheduling[38]$ diff b_geocode c_bpi
> ./input/2009_A_Spring-proximity
> ./input/2009_A_Spring-proximity-distances
> ./input/2009_A_Spring-proximity-failures
> ./input/2009_A_Spring-proximity-standalone

NeighboringCommunityFind

fablio-mpro:~/Sites/sites/usyvl/scheduling[39]$ diff c_bpi d_ncf 
> ./input/NC-data.ser
> ./schedules/2009_A_Spring-si
> ./schedules/2009_A_Spring-si/index.htm
> ./schedules/2009_A_Spring-si/Index.html
> ./schedules/NeighboringCommunities.html
> ./schedules/NeighboringCommunitiesCMI.html

GroupTournSites

fablio-mpro:~/Sites/sites/usyvl/scheduling[40]$ diff d_ncf e_gts  
> ./input/2009_A_Spring-tourn-group

WeightTSLPreferred

fablio-mpro:~/Sites/sites/usyvl/scheduling[41]$ diff e_gts f_wtsl 
> ./input/TSL_preferred_weighted

fablio-mpro:~/Sites/sites/usyvl/scheduling[42]$ diff f_wtsl g_uta
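
The same find-and-diff check could be done in Python. This is a sketch only, assuming a snapshot is taken by walking the scheduling directory before and after each program runs. The helper names `snapshot` and `new_files` are hypothetical.

```python
import os


def snapshot(root):
    """Return the set of file paths under root, written like find output (./a/b)."""
    paths = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            paths.add("./" + rel.replace(os.sep, "/"))
    return paths


def new_files(before, after):
    """Paths present in the later snapshot but not in the earlier one."""
    return sorted(after - before)


if __name__ == "__main__":
    before = snapshot(".")
    # ... run one of the scheduling programs here ...
    after = snapshot(".")
    for path in new_files(before, after):
        print(">", path)
```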


sqlite goodies

Import data into sqlite via the command line:

> sqlite3 data.db
SQLite version 3.5.9
Enter ".help" for instructions
sqlite> CREATE TABLE data (name TEXT, address TEXT, salary INTEGER);
sqlite> .mode csv
sqlite> .import out.csv data
sqlite> .quit
>
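
For scripted runs, roughly the same import can be done with Python's standard `sqlite3` and `csv` modules. This is a sketch assuming a three-column file with no header row and no blank lines. `import_csv` is a name made up here, and the table and file names are only borrowed from the session above.

```python
import csv
import sqlite3


def import_csv(db_path, csv_path):
    """Load a 3-column CSV into the 'data' table and return the row count."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS data "
                "(name TEXT, address TEXT, salary INTEGER)"
            )
            with open(csv_path, newline="") as fh:
                # csv.reader handles quoted fields that contain commas
                conn.executemany("INSERT INTO data VALUES (?, ?, ?)", csv.reader(fh))
        return conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(import_csv("data.db", "out.csv"))
```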


Did this on 2012-09-02 to run some tests

CREATE TABLE ev (program text, type text, location text, beg text, end text, address text, addr2 text, city text, state text, zip text, cmiid integer, otherid integer);
CREATE TABLE tm (junk text, name text, program text, division text, unk text, unk2 text, num integer, coach text);


There is also a tabs mode that should work in much the same way:

sqlite> .mode tabs

csv2tsv

#!/usr/bin/perl -s

# csv2tsv - filter comma-separated text (csv) into tab-separated text (tsv)
# Steve Kinzler, kinzler@cs.indiana.edu, Jul 03
# http://www.cs.indiana.edu/~kinzler/home.html#unix

# NOTE: CSV text may be double-quoted and may include commas in quotes,
#       TSV is assumed to never be quoted and all tabs are separators

use Text::CSV;
#use Text::CSV_XS;	# optional, speeds up Text::CSV

die "usage: $0 [ file ... ]\n" if $h;

$csv = Text::CSV->new;

while (<>) {
	s/[\r\n]*$//;

	warn("$0: warning, changing tab into 8 spaces in line $. [$_]\n")
		if s/\t/        /g;

	warn("$0: skipping line, parse failed in line $. [$_]\n"), next
		unless $csv->parse($_);

	print join("\t", $csv->fields()), "\n";
}