Stata-Tex: Create custom LaTeX tables from Stata.
It's often necessary to produce an output table that doesn't fit any of the formats provided by the standard tools (outreg, esttab, etc.). Some examples:
- Showing p-values for differences of coefficients
- Putting certain coefficients in bold or colors
- Putting different outcome variables in the same row of a regression table
- Multi-panel tables with different formatting in each panel
Stata-Tex allows you to separate the LaTeX table template from the table data. This lets you set up and compile exactly the LaTeX table you want, with placeholders for the data. You can then generate the data separately and transfer it into the LaTeX table automatically.
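To make the template/data separation concrete, here is a minimal Python sketch of the idea. It is not Stata-Tex's actual code or file format: the `<<name>>` placeholder syntax, the key,value data layout, and the function names `load_values` and `fill_template` are all assumptions chosen for illustration.

```python
import csv
import io
import re

# Hypothetical placeholder syntax: <<name>> inside an otherwise ordinary LaTeX table.
TEMPLATE = r"""\begin{tabular}{lcc}
 & (1) & (2) \\
Treatment & <<beta1>> & <<beta2>> \\
p-value of difference & \multicolumn{2}{c}{<<p_diff>>} \\
\end{tabular}"""

# Hypothetical data file contents: one "key,value" row per table cell,
# e.g. written out from Stata after running the regressions.
DATA_CSV = "beta1,0.124\nbeta2,0.087\np_diff,0.031\n"


def load_values(csv_text):
    """Read key,value rows into a dict mapping placeholder name -> cell text."""
    return {row[0]: row[1] for row in csv.reader(io.StringIO(csv_text)) if row}


def fill_template(template, values):
    """Replace every <<name>> placeholder with its value; fail loudly if one is missing."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder {key!r}")
        return values[key]

    return re.sub(r"<<(\w+)>>", substitute, template)


filled = fill_template(TEMPLATE, load_values(DATA_CSV))
print(filled)
```

Because the template is plain LaTeX, the layout (bold cells, panels, merged columns) can be adjusted without touching the code that produces the numbers, and vice versa.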
Masala Merge: Fuzzy matching of Hindi (or any) names.
This is Python and Stata code for fuzzy merging Hindi names. The algorithm is based on the Levenshtein edit distance, which counts the substitutions, deletions and insertions required to get from one word to another. We modified this to lower the cost of certain substitutions that are common in Hindi names; e.g., KS->X would have a cost of 2 in Levenshtein, but we assign it a cost of 0.2. Modifying this code for another language consists only of changing this list of costs.
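The modified edit distance can be sketched as below. This is an illustrative reimplementation under stated assumptions, not the Masala Merge source: the function name `masala_distance` and the `CHEAP_SUBSTITUTIONS` table are hypothetical, only the KS->X cost of 0.2 comes from the description above, and inputs are assumed to be upper-cased already.

```python
# Hypothetical cost table: (source substring, target substring) -> cost.
# Only the KS->X cost of 0.2 comes from the text; the reverse entry is an assumption.
CHEAP_SUBSTITUTIONS = {("KS", "X"): 0.2, ("X", "KS"): 0.2}


def masala_distance(a, b, cheap=CHEAP_SUBSTITUTIONS):
    """Levenshtein distance extended with discounted multi-character substitutions."""
    # dist[i][j] = cheapest way to turn a[:i] into b[:j]
    dist = [[0.0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        for j in range(len(b) + 1):
            if i == 0 and j == 0:
                continue
            best = float("inf")
            if i > 0:
                best = min(best, dist[i - 1][j] + 1)  # delete a[i-1]
            if j > 0:
                best = min(best, dist[i][j - 1] + 1)  # insert b[j-1]
            if i > 0 and j > 0:
                step = 0 if a[i - 1] == b[j - 1] else 1  # match or plain substitution
                best = min(best, dist[i - 1][j - 1] + step)
            # Discounted substitutions: src at the end of a[:i] becomes dst at the end of b[:j].
            for (src, dst), cost in cheap.items():
                if (i >= len(src) and j >= len(dst)
                        and a[i - len(src):i] == src
                        and b[j - len(dst):j] == dst):
                    best = min(best, dist[i - len(src)][j - len(dst)] + cost)
            dist[i][j] = best
    return dist[len(a)][len(b)]


plain = masala_distance("LAKSMI", "LAXMI", cheap={})  # ordinary Levenshtein
discounted = masala_distance("LAKSMI", "LAXMI")       # with the KS->X discount
```

Passing an empty cost table recovers ordinary Levenshtein distance, so adapting the sketch to another language amounts to swapping in a different table.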
In addition to calculating edit distances, the program uses a default set of "smart" thresholds to determine which fuzzy matches to accept and which ones to reject. The premise is that you can tolerate a higher edit distance when matching very long words. The smart thresholds also reject matches if the next nearest match is very close - even an exact match should be considered uncertain if there's another very close match.
We calibrated costs, common substitutions and smart distance thresholds by analyzing results from a set of 500,000 known village name matches from the 1991 and 2001 population censuses. Higher or lower thresholds may be desirable depending on what you are trying to match. We use very conservative thresholds; we put a much higher cost on incorrect matches than on missed matches. You can raise or lower all thresholds proportionately with the fuzziness() parameter. The default is 1.
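The accept/reject logic described above might look roughly like the following sketch. The numbers here are invented for illustration and are not the calibrated Masala Merge thresholds; `max_distance`, `accept_match`, and `MIN_GAP` are hypothetical names, and leaving the gap requirement unscaled by fuzziness is an assumption.

```python
# Hypothetical constant: the best match must beat the runner-up by at least this much.
MIN_GAP = 0.5


def max_distance(name, fuzziness=1.0):
    """Largest acceptable edit distance; longer names tolerate more edits (made-up rule)."""
    return fuzziness * (0.4 + 0.1 * len(name))


def accept_match(name, best_dist, second_dist=None, fuzziness=1.0):
    """Decide whether to keep the best candidate match for `name`.

    best_dist is the distance to the nearest candidate; second_dist is the
    distance to the next nearest candidate, or None if there is no other candidate.
    """
    if best_dist > max_distance(name, fuzziness):
        return False  # too far away, even for a name of this length
    if second_dist is not None and second_dist - best_dist < MIN_GAP:
        return False  # ambiguous: another candidate is almost as close
    return True


clear_match = accept_match("RAMPUR", 0.2, 3.0)
ambiguous_exact = accept_match("RAMPUR", 0.0, 0.2)
too_far = accept_match("RAMPUR", 1.5, 4.0)
looser = accept_match("RAMPUR", 1.5, 4.0, fuzziness=2.0)
```

In this sketch an exact match is still rejected when a second candidate sits only 0.2 away, which mirrors the conservative preference for missed matches over incorrect ones.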
There might be a better fuzzy matching program out there - if so, please let me know about it! On location name matches, masala-merge consistently outperforms Stata's reclink. But reclink's string similarity algorithm is going to do better, for example, if you want to match "Dell Inc." to "Dell Incorporated".
	    
If you want to optimize this for another language, please let me know about it! Just a few lines need to be changed (but you need to figure out the common substitutions), and I'd be happy to post the new language function here.
