Machine Learning/CRF | Posted by Unknown user 2008. 8. 29. 09:38

CRF++

Learning:
crf_learn template_file train_file model_file

crf_learn -f 2 template_file train_file model_file    # -f NUM : feature cut-off threshold; only features occurring at least NUM times are used
crf_learn -a MIRA template_file train_file model_file  # -a MIRA : learn with the MIRA algorithm instead of the default CRF
crf_learn -c 1.5 template_file train_file model_file   # -c FLOAT : trade-off between overfitting and underfitting; larger values fit the training data more tightly

Test:
crf_test -m model_file test_file

crf_test -n 20 -m model_file test_file  # -n NUM : N-best results
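
Both tools are plain command-line programs, so the whole train-and-test cycle is easy to script. A minimal Python sketch (assuming crf_learn and crf_test are on the PATH and that the template, train, and test files from the commands above exist):

import subprocess

# train with a feature cut-off of 2 and C = 1.5
subprocess.call(['crf_learn', '-f', '2', '-c', '1.5',
                 'template_file', 'train_file', 'model_file'])

# tag the test file, keeping the 20 best output sequences
subprocess.call(['crf_test', '-n', '20', '-m', 'model_file', 'test_file'])
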
Machine Learning | Posted by Unknown user 2008. 7. 15. 13:12

semi-supervised learning -- Mitchell

Reading List | Posted by Unknown user 2008. 7. 9. 12:53

Advanced Topics in Machine Learning

Instructor

Thorsten Joachims, tj@cs.cornell.edu
Syllabus

In particular, the course will cover the following main topics:

Part 1:

  • Support Vector Machines and Related Methods: Perceptron, optimal hyperplane and maximum-margin separation, soft-margin, SVMs for regression, Gaussian Processes, Boosting, regularized regression methods
  • Learning with Kernels: properties, real-valued feature vectors, sequences and other structured data, Fisher kernels
  • Statistical Learning Theory: no free lunch, VC theory, PAC-Bayesian, bias/variance, error bounds, leave-one-out bounds
  • Error Estimation and Model Selection: leave-one-out and cross-validation, holdout testing, bootstrap estimation
Part 2:
  • Transductive Learning: How can one use unlabeled data to improve performance in supervised learning? What is the information contained in unlabeled data? What assumptions do we need to make? How can we design efficient algorithms?
  • Learning Complex Structures: What if the target function is more complex than in classification or regression? For example, the goal might not be a binary classification function but an ordering (i.e. retrieval) function for information retrieval. Or what if the input to learning is not a class label but merely pair-wise preferences like "A is preferred over B"? (A short sketch of this pairwise-preference setting follows this list.)
  • Learning Kernels: The kernel defines the inductive bias of the learning algorithm and is key to achieving good performance. This makes selecting a kernel one of the most crucial design decisions. How can we automate the selection process? In particular, how can one construct a good kernel from data? What are the situations where this might work? What are the assumptions?
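
As a concrete illustration of the pair-wise preference setting mentioned above, here is a minimal sketch (toy made-up data, plain Python, not taken from the course materials) of the standard reduction used in ranking methods: each preference "A is preferred over B" becomes a training example whose feature vector is the difference x_A - x_B with label +1, and a linear classifier trained on these differences yields a scoring function w·x that can rank new items.

# toy feature vectors for three items (made-up numbers)
items = {'A': [1.0, 0.0], 'B': [0.5, 0.5], 'C': [0.0, 1.0]}

# pair-wise preferences: the first item is preferred over the second
preferences = [('A', 'B'), ('B', 'C')]

# each preference becomes a difference vector with label +1, plus its mirror with label -1
examples = []
for better, worse in preferences:
    diff = [a - b for a, b in zip(items[better], items[worse])]
    examples.append((diff, +1))
    examples.append(([-d for d in diff], -1))

# a few perceptron passes learn a weight vector w; w . x is then the ranking score
w = [0.0, 0.0]
for _ in range(10):
    for x, y in examples:
        if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
            w = [wi + y * xi for wi, xi in zip(w, x)]

ranking = sorted(items, key=lambda k: -sum(wi * xi for wi, xi in zip(w, items[k])))
print(ranking)   # items ordered by learned score; the order should respect the preferences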

Methods and theory will be illustrated with practical examples, in particular from the areas of information retrieval and language technology.

Lecture Notes, Slides, and Handouts

Lecture notes and slides are also handed out in class.
Homework Assignments

Homework 1: Perceptron and Optimal Hyperplanes

Homework 2: Training SVMs and Leave-One-Out

Homework 3: Kernels and Statistical Learning Theory

Readings

We will read some of the following papers in the second half of the course:

Learning Rankings:

  1. William W. Cohen, Robert E. Schapire, Yoram Singer, Learning to order things, Journal of Artificial Intelligence Research, 10, 1999. (Steven, 4/15)
  2. Y. Freund, R. Iyer, R. Schapire, and Y. Singer, An efficient boosting algorithm for combining preferences, ICML, 1998. (Scott, 4/17)
  3. T. Joachims, Optimizing Search Engines using Clickthrough Data, KDD, 2002. (Thorsten, 4/10)
  4. R. Herbrich, T. Graepel, and K. Obermayer. Large Margin Rank Boundaries for Ordinal Regression. Advances in Large Margin Classifiers , pages 115-132, 2000. (Thorsten, 4/8)
  5. R. Caruana, S. Baluja, and T. Mitchell, Using the Future to 'Sort Out' the Present: Rankprop and Multitask Learning for Medical Risk Evaluation, NIPS, 1995. (Rich, 4/10)

Transductive Learning / Learning from Labeled and Unlabeled Data:

  1. K. Nigam, A. McCallum, S. Thrun, and T. Mitchell. Text Classification from Labeled and Unlabeled Documents using EM. Machine Learning, 39(2/3). pp. 103-134. 2000. (Mark, 4/22)
  2. T. Joachims, Transductive Inference for Text Classification using Support Vector Machines. ICML, 1999. (Thorsten, 4/17)
  3. A. Blum and T. Mitchell. Combining Labeled and Unlabeled Data with Co-Training, COLT, 1998. (Andy, 4/24)
  4. O. Chapelle, J. Weston and B. Schölkopf, Cluster Kernels for Semi-Supervised Learning. NIPS, 2003. (Phil, 4/22)
  5. M. Szummer and T. Jaakkola, Partially labeled classification with Markov random walks, NIPS, 2001. (Filip, 4/22)
  6. A. Blum, S. Chawla, Learning from Labeled and Unlabeled Data using Graph Mincuts. ICML, 2001. (Alan, 4/24)

Learning to Learn / Learning Kernels:

  1. R. Caruana, Multitask Learning. Machine Learning 28(1): 41-75, 1997. (Rich, 4/29)
  2. Sebastian Thrun and Joseph O'Sullivan, Discovering Structure in Multiple Learning Tasks: The TC Algorithm, ICML, 1996. (Stefan, 4/29)
  3. N. Cristianini, J. Kandola, A. Elisseeff, and J. Shawe-Taylor, On Kernel Target Alignment, JMLR. (Thorsten, 5/1)
  4. T. Jaakkola and D. Haussler. Exploiting generative models in discriminative classifiers, NIPS, 1998. (Joshua, 5/1)

Other topics:

  1. B. Schölkopf, J. Platt, J. Shawe-Taylor, A. J. Smola, and R. C. Williamson. Estimating the support of a high-dimensional distribution. Technical Report 99-87, Microsoft Research, 1999. To appear in Neural Computation, 2001.
    and  
    Ben-Hur et al., Support Vector Clustering. JMLR, 2, 2001.
  2. A. J. Smola and B. Schölkopf. A tutorial on support vector regression. NeuroCOLT Technical Report NC-TR-98-030, Royal Holloway College, University of London, UK, 1998. To appear in Statistics and Computing, 2001. (pages 1-14 only) (Jingbo, 4/24)
  3. B. Schölkopf, A. Smola, K. Müller, Kernel Principal Component Analysis, in: B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods --- Support Vector Learning, MIT Press, Cambridge, MA, 1999, pages 327-352. Short version or chapter in Support Vector Learning for background. (Liviu, 4/17)
  4. John Platt, Large-Margin DAGs for Multi-Class Classification, NIPS 2000. (Alex, 4/15)
Reading List | Posted by Unknown user 2008. 7. 8. 15:33

semi-supervised learning for structured variables

Help!
Machine Learning | Posted by Unknown user 2008. 7. 8. 01:38

Categories of Machine Learning

Machine learning algorithms are organized into a taxonomy, based on the desired outcome of the algorithm. Common algorithm types include:

  • Supervised learning — in which the algorithm generates a function that maps inputs to desired outputs. One standard formulation of the supervised learning task is the classification problem: the learner is required to learn (to approximate) the behavior of a function which maps a vector into one of several classes by looking at several input-output examples of the function.
  • Unsupervised learning — in which the algorithm models a set of inputs; labeled examples are not available.
  • Semi-supervised learning — which combines both labeled and unlabeled examples to generate an appropriate function or classifier.
  • Reinforcement learning — in which the algorithm learns a policy of how to act given an observation of the world. Every action has some impact in the environment, and the environment provides feedback that guides the learning algorithm.
  • Transduction — similar to supervised learning, but it does not explicitly construct a function; instead, it tries to predict new outputs based on the training inputs, training outputs, and test inputs that are available while training.
  • Learning to learn — in which the algorithm learns its own inductive bias based on previous experience.

The computational analysis of machine learning algorithms and their performance is a branch of theoretical computer science known as computational learning theory.
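
As a tiny, concrete illustration of the supervised setting described above, here is a minimal sketch (made-up toy data) of a 1-nearest-neighbour classifier: it approximates the input-to-output mapping purely from labeled input-output examples.

# labeled training examples: (feature vector, class label) -- made-up data
train = [([1.0, 1.0], 'A'), ([1.2, 0.8], 'A'), ([4.0, 5.0], 'B'), ([4.5, 4.8], 'B')]

def classify(x):
    # 1-nearest neighbour: predict the label of the closest training example
    def dist(p):
        return sum((a - b) ** 2 for a, b in zip(x, p))
    nearest = min(train, key=lambda example: dist(example[0]))
    return nearest[1]

print(classify([1.1, 0.9]))   # 'A'
print(classify([5.0, 5.0]))   # 'B'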

Linux & Ubuntu/Linux tips | Posted by Unknown user 2008. 7. 7. 10:06

grep

print all lines that contain ( 完 ):

$> grep -w '( 完 )' train.UCUP.utf8
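
A rough Python equivalent of this (simple substring matching rather than grep's whole-word match; the file name is taken from the example above):

# -*- coding: utf-8 -*-
# print every line of the file that contains the pattern
for line in open('train.UCUP.utf8'):
    if '( 完 )' in line:
        print(line.rstrip())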


Linux & Ubuntu/Linux tips | Posted by Unknown user 2008. 7. 4. 16:27

How to find out which version of Ubuntu you're running

cat /etc/issue

Ubuntu 8.04.1 \n\l
Programming/Python | Posted by Unknown user 2008. 6. 23. 12:13

Command Line Arguments

sys.argv
is a list that contains the command-line arguments passed to the program.

for example,
>> python file2DIC -f CityU.utf8.seg -d CityU.DIC

then the value of sys.argv is:
sys.argv[0] = "file2DIC"
sys.argv[1] = "-f"
sys.argv[2] = "CityU.utf8.seg"
sys.argv[3] = "-d"
sys.argv[4] = "CityU.DIC"

Remember, if you'd like to use it in Python, don't forget to add the following line to your program:
import sys
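
To actually consume options like -f and -d, the standard getopt module can parse sys.argv for you. A minimal sketch (a hypothetical skeleton for the file2DIC example above):

import sys
import getopt

# expect -f SEGMENTED_FILE and -d DICTIONARY_FILE, as in the example above
opts, args = getopt.getopt(sys.argv[1:], 'f:d:')
for flag, value in opts:
    if flag == '-f':
        seg_file = value     # e.g. CityU.utf8.seg
    elif flag == '-d':
        dic_file = value     # e.g. CityU.DIC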
Programming/Python | Posted by Unknown user 2008. 6. 23. 12:10

pickle & unpickle

import pickle

# pickling a list
f = open ( "pickle.dat", "w" )
mixList = [ 'one', 1, 'two', 2 ]
pickle.dump ( mixList, f )
f.close ()

# unpickling
f = open ( "pickle.dat", "r" )
mixList = pickle.load ( f )
print mixList
f.close ()
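
The same pattern works for other objects such as dicts, and a binary protocol produces smaller files (a small sketch with a made-up file name; note the 'wb'/'rb' modes):

f = open ( "counts.dat", "wb" )
pickle.dump ( {'one': 1, 'two': 2}, f, pickle.HIGHEST_PROTOCOL )
f.close ()

f = open ( "counts.dat", "rb" )
print ( pickle.load ( f ) )
f.close ()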
Programming/Python | Posted by Unknown user 2008. 5. 11. 02:07

function dict

Functions as objects

# stub functions so the dispatch example actually runs
def function1 (): print 'function one'
def function2 (): print 'function two'
def function3 (): print 'function three'

switch = {
            'one' : function1,
            'two' : function2,
            'three' : function3
            }

choice = raw_input ( 'one, two or three? ' )

try:
    result = switch[choice]
except KeyError :
    print "I didn't understand your choice."
else :
    result ()