0. Overview

LSTM Networks

LSTMs are explicitly designed to avoid the long-term dependency problem. Remembering information for long periods of time is practically their default behavior, not something they struggle to learn!

LSTMs also have this chain like structure, but the repeating module has a different structure. Instead of having a single neural network layer, there are four, interacting in a very special way.

The repeating module in an LSTM contains four interacting layers.

Don’t worry about the details of what’s going on. We’ll walk through the LSTM diagram step by step later. For now, let’s just try to get comfortable with the notation we’ll be using.

In the above diagram, each line carries an entire vector, from the output of one node to the inputs of others. The pink circles represent pointwise operations, like vector addition, while the yellow boxes are learned neural network layers. Lines merging denote concatenation, while a line forking denote its content being copied and the copies going to different locations.

The Core Idea Behind LSTMs

The key to LSTMs is the cell state, the horizontal line running through the top of the diagram.

The cell state is kind of like a conveyor belt. It runs straight down the entire chain, with only some minor linear interactions. It’s very easy for information to just flow along it unchanged.

The LSTM does have the ability to remove or add information to the cell state, carefully regulated by structures called gates.

Gates are a way to optionally let information through. They are composed out of a sigmoid neural net layer and a pointwise multiplication operation.

The sigmoid layer outputs numbers between zero and one, describing how much of each component should be let through. A value of zero means “let nothing through,” while a value of one means “let everything through!”

An LSTM has three of these gates, to protect and control the cell state.
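As a concrete sketch of this gating mechanism (all parameters below are illustrative random values, not a trained model), a gate is just a learned sigmoid layer whose output is multiplied pointwise into a signal:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative gate: W and b stand in for learned parameters.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
b = np.zeros(4)

x = rng.normal(size=4)        # gate input (e.g. the concatenated [h_{t-1}, x_t])
signal = rng.normal(size=4)   # the information being regulated

gate = sigmoid(W @ x + b)     # each component lies strictly in (0, 1)
passed = gate * signal        # pointwise: how much of each component gets through
```

A gate value near 0 suppresses the corresponding component of the signal; a value near 1 passes it through almost unchanged.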

Step-by-Step LSTM Walk Through

The first step in our LSTM is to decide what information we're going to throw away from the cell state. This decision is made by a sigmoid layer called the "forget gate layer." It looks at h_{t-1} and x_t, and outputs a number between 0 and 1 for each value in the cell state C_{t-1}: 0 represents "completely get rid of this," while 1 represents "completely keep this."

It's now time to update the old cell state, C_{t-1}, into the new cell state C_t. The previous steps already decided what to do, we just need to actually do it.

We multiply the old state by f_t, forgetting the things we decided to forget earlier. Then we add i_t * C̃_t:

C_t = f_t * C_{t-1} + i_t * C̃_t

This is the new candidate values, scaled by how much we decided to update each state value.

In the case of the language model, this is where we’d actually drop the information about the old subject’s gender and add the new information, as we decided in the previous steps.

Finally, we need to decide what we're going to output. This output will be based on our cell state, but will be a filtered version. First, we run a sigmoid layer which decides what parts of the cell state we're going to output. Then, we put the cell state through tanh (to push the values to be between -1 and 1) and multiply it by the output of the sigmoid gate, so that we only output the parts we decided to.
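Putting the three gates together, one forward step of an LSTM cell can be sketched in NumPy as follows; the weight matrices and dimensions here are illustrative placeholders, not trained parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, C_prev, params):
    """One LSTM time step, following the equations in the text."""
    Wf, Wi, Wc, Wo, bf, bi, bc, bo = params
    z = np.concatenate([h_prev, x])   # merging lines = concatenation
    f = sigmoid(Wf @ z + bf)          # forget gate
    i = sigmoid(Wi @ z + bi)          # input gate
    C_tilde = np.tanh(Wc @ z + bc)    # candidate values
    C = f * C_prev + i * C_tilde      # new cell state
    o = sigmoid(Wo @ z + bo)          # output gate
    h = o * np.tanh(C)                # filtered output
    return h, C

# Toy dimensions: input size 3, hidden size 2; random (untrained) weights.
rng = np.random.default_rng(42)
n_in, n_h = 3, 2
params = tuple(rng.normal(size=(n_h, n_h + n_in)) for _ in range(4)) + \
         tuple(np.zeros(n_h) for _ in range(4))
h, C = lstm_step(rng.normal(size=n_in), np.zeros(n_h), np.zeros(n_h), params)
```

Because h is a tanh-squashed state scaled by a gate in (0, 1), every component of the output lies strictly between -1 and 1.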

For the language model example, since it just saw a subject, it might want to output information relevant to a verb, in case that’s what is coming next. For example, it might output whether the subject is singular or plural, so that we know what form a verb should be conjugated into if that’s what follows next.

Variants on Long Short Term Memory

What I’ve described so far is a pretty normal LSTM. But not all LSTMs are the same as the above. In fact, it seems like almost every paper involving LSTMs uses a slightly different version. The differences are minor, but it’s worth mentioning some of them.

One popular LSTM variant, introduced by Gers & Schmidhuber (2000), is adding “peephole connections.” This means that we let the gate layers look at the cell state.

The above diagram adds peepholes to all the gates, but many papers will give some peepholes and not others.

Another variation is to use coupled forget and input gates. Instead of separately deciding what to forget and what we should add new information to, we make those decisions together. We only forget when we’re going to input something in its place. We only input new values to the state when we forget something older.

A slightly more dramatic variation on the LSTM is the Gated Recurrent Unit, or GRU, introduced by Cho, et al. (2014). It combines the forget and input gates into a single “update gate.” It also merges the cell state and hidden state, and makes some other changes. The resulting model is simpler than standard LSTM models, and has been growing increasingly popular.

These are only a few of the most notable LSTM variants. There are lots of others, like Depth Gated RNNs by Yao, et al. (2015). There’s also some completely different approach to tackling long-term dependencies, like Clockwork RNNs by Koutnik, et al. (2014).

Which of these variants is best? Do the differences matter? Greff, et al. (2015) do a nice comparison of popular variants, finding that they’re all about the same. Jozefowicz, et al. (2015) tested more than ten thousand RNN architectures, finding some that worked better than LSTMs on certain tasks.

Conclusion

Earlier, I mentioned the remarkable results people are achieving with RNNs. Essentially all of these are achieved using LSTMs. They really work a lot better for most tasks!

Written down as a set of equations, LSTMs look pretty intimidating. Hopefully, walking through them step by step in this essay has made them a bit more approachable.

LSTMs were a big step in what we can accomplish with RNNs. It’s natural to wonder: is there another big step? A common opinion among researchers is: “Yes! There is a next step and it’s attention!” The idea is to let every step of an RNN pick information to look at from some larger collection of information. For example, if you are using an RNN to create a caption describing an image, it might pick a part of the image to look at for every word it outputs. In fact, Xu, et al. (2015) do exactly this – it might be a fun starting point if you want to explore attention! There’s been a number of really exciting results using attention, and it seems like a lot more are around the corner…

Attention isn’t the only exciting thread in RNN research. For example, Grid LSTMs by Kalchbrenner, et al. (2015) seem extremely promising. Work using RNNs in generative models – such as Gregor, et al. (2015), Chung, et al. (2015), or Bayer & Osendorfer (2015) – also seems very interesting. The last few years have been an exciting time for recurrent neural networks, and the coming ones promise to only be more so!

Some Mathematical Functions in ML

1. argmax function: returns the value of the argument at which the function attains its maximum;
2. Likelihood function:

L(θ|x) = P(X = x|θ).

3. softmax function: given inputs z_1, …, z_m, it outputs the vector

y = [y_1, …, y_m], where y_i = e^{z_i} / Σ_j e^{z_j}.
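A minimal NumPy sketch of argmax and softmax as defined above (subtracting the maximum before exponentiating is a standard numerical-stability trick, not part of the definition):

```python
import numpy as np

def softmax(z):
    """y_i = exp(z_i) / sum_j exp(z_j), computed stably."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([1.0, 2.0, 3.0])
y = softmax(z)
# y sums to 1 (up to floating point); argmax returns the
# index of the largest component, here 2.
best = np.argmax(y)
```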

JavaScript Study Notes

// Study questions:
// 1. What does wrapping a function in parentheses and appending .call(this) mean?
// 2. What does a dot after a function mean?
//    Answer: defining a function produces an object, so the . operator can be applied to it; call and apply are built-in methods, used for indirect invocation.
// 3. What do the surrounding parentheses mean?
//    Answer: as I understand it, they execute the function; the function is defined once and used once. Is this the standard way to import and initialize a JS module?

// Study summary:
// 1. A function is itself an object and can be assigned, so a function can be defined via assignment, e.g. var f = function(x){return x*x}; note that the function in this example has no name.
// 2. A function can also be defined in a style similar to C or Python, except there is no keyword like func xxx or def xxx, e.g. function f(x) {return x*x;}
// 3. The in operator can be used to test whether a property exists on an object, e.g. "x" in obj
// 4. instanceof can be used to test whether an object was created from a given class.
// 5. if statements are basically the same as C syntax, and loops are similar too; for has one special form, e.g. for(var p in o);
// 6. Jump statements are used for exception handling and deserve a closer look;
// 7. JS checks neither the types nor the number of a function's arguments; for missing arguments there is a small idiom that solves the default-value problem, e.g. a = a || [];
// 8. When more arguments are passed than were declared, the arguments object can be used to retrieve them; note its two properties, callee and caller;
// 9. Creating objects without classes, for example:
// var empty = {}; // An object with no properties
// var point = { x:0, y:0 }; // Two properties
// var point2 = { x:point.x, y:point.y+1 }; // More complex values
// 10. JS class definitions (before ES6) have no class syntax; they are written entirely with function statements, which is quite distinctive.
// 11. A class is instantiated by creating an object derived from a prototype object, e.g.:
// if(Object.create)
//     return Object.create(p);
Four Tips for Managing Cross-Department Projects

A manager in a matrix structure needs four skills to succeed: understanding others, conflict management, influence, and self-awareness. One of the five core values is "let me do it," and it is very important for leaders to demonstrate it: never shirk responsibility, and lead the team responsibly.

A Collection of Open-Source Computer Vision Code (reposted)

• Normalized Cut [1] [Matlab code]
• Greg Mori's Superpixel code [2] [Matlab code]
• Efficient Graph-based Image Segmentation [3] [C++ code] [Matlab wrapper]
• Mean-Shift Image Segmentation [4] [EDISON C++ code] [Matlab wrapper]
• OWT-UCM Hierarchical Segmentation [5] [Resources]
• TurboPixels [6] [Matlab code 32bit] [Matlab code 64bit] [Updated code]
• Quick-Shift [7] [VLFeat]
• SLIC Superpixels [8] [Project]
• Segmentation by Minimum Code Length [9] [Project]
• Biased Normalized Cut [10] [Project]
• Segmentation Tree [11-12] [Project]
• Entropy Rate Superpixel Segmentation [13] [Code]
• Fast Approximate Energy Minimization via Graph Cuts[Paper][Code]
• Efficient Planar Graph Cuts with Applications in Computer Vision[Paper][Code]
• Isoperimetric Graph Partitioning for Image Segmentation[Paper][Code]
• Random Walks for Image Segmentation[Paper][Code]
• Blossom V: A new implementation of a minimum cost perfect matching algorithm[Code]
• An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Computer Vision[Paper][Code]
• Geodesic Star Convexity for Interactive Image Segmentation[Project]
• Contour Detection and Image Segmentation Resources[Project][Code]
• Biased Normalized Cuts[Project]
• Max-flow/min-cut[Project]
• Chan-Vese Segmentation using Level Set[Project]
• A Toolbox of Level Set Methods[Project]
• Re-initialization Free Level Set Evolution via Reaction Diffusion[Project]
• Improved C-V active contour model[Paper][Code]
• A Variational Multiphase Level Set Approach to Simultaneous Segmentation and Bias Correction[Paper][Code]
• Level Set Method Research by Chunming Li[Project]
• ClassCut for Unsupervised Class Segmentation[code]
• SEEDS: Superpixels Extracted via Energy-Driven Sampling [Project][other]

• A simple object detector with boosting [Project]
• INRIA Object Detection and Localization Toolkit [1] [Project]
• Discriminatively Trained Deformable Part Models [2] [Project]
• Cascade Object Detection with Deformable Part Models [3] [Project]
• Poselet [4] [Project]
• Implicit Shape Model [5] [Project]
• Viola and Jones’s Face Detection [6] [Project]
• Bayesian Modelling of Dynamic Scenes for Object Detection[Paper][Code]
• Hand detection using multiple proposals[Project]
• Color Constancy, Intrinsic Images, and Shape Estimation[Paper][Code]
• Discriminatively trained deformable part models[Project]
• Gradient Response Maps for Real-Time Detection of Texture-Less Objects: LineMOD [Project]
• Image Processing On Line[Project]
• Robust Optical Flow Estimation[Project]
• Where’s Waldo: Matching People in Images of Crowds[Project]
• Scalable Multi-class Object Detection[Project]
• Class-Specific Hough Forests for Object Detection[Project]
• Deformed Lattice Detection In Real-World Images[Project]

• Itti, Koch, and Niebur’ saliency detection [1] [Matlab code]
• Frequency-tuned salient region detection [2] [Project]
• Saliency detection using maximum symmetric surround [3] [Project]
• Attention via Information Maximization [4] [Matlab code]
• Context-aware saliency detection [5] [Matlab code]
• Graph-based visual saliency [6] [Matlab code]
• Saliency detection: A spectral residual approach. [7] [Matlab code]
• Segmenting salient objects from images and videos. [8] [Matlab code]
• Saliency Using Natural statistics. [9] [Matlab code]
• Discriminant Saliency for Visual Recognition from Cluttered Scenes. [10] [Code]
• Learning to Predict Where Humans Look [11] [Project]
• Global Contrast based Salient Region Detection [12] [Project]
• Bayesian Saliency via Low and Mid Level Cues[Project]
• Top-Down Visual Saliency via Joint CRF and Dictionary Learning[Paper][Code]
• Saliency Detection: A Spectral Residual Approach[Code]

• Pyramid Match [1] [Project]
• Spatial Pyramid Matching [2] [Code]
• Locality-constrained Linear Coding [3] [Project] [Matlab code]
• Sparse Coding [4] [Project] [Matlab code]
• Texture Classification [5] [Project]
• Multiple Kernels for Image Classification [6] [Project]
• Feature Combination [7] [Project]
• SuperParsing [Code]
• Large Scale Correlation Clustering Optimization[Matlab code]
• Detecting and Sketching the Common[Project]
• Self-Tuning Spectral Clustering[Project][Code]
• User Assisted Separation of Reflections from a Single Image Using a Sparsity Prior[Paper][Code]
• Filters for Texture Classification[Project]
• Multiple Kernel Learning for Image Classification[Project]
• SLIC Superpixels[Project]

• A Closed Form Solution to Natural Image Matting [Code]
• Spectral Matting [Project]
• Learning-based Matting [Code]

• A Forest of Sensors – Tracking Adaptive Background Mixture Models [Project]
• Object Tracking via Partial Least Squares Analysis[Paper][Code]
• Robust Object Tracking with Online Multiple Instance Learning[Paper][Code]
• Online Visual Tracking with Histograms and Articulating Blocks[Project]
• Incremental Learning for Robust Visual Tracking[Project]
• Real-time Compressive Tracking[Project]
• Robust Object Tracking via Sparsity-based Collaborative Model[Project]
• Visual Tracking via Adaptive Structural Local Sparse Appearance Model[Project]
• Online Discriminative Object Tracking with Local Sparse Representation[Paper][Code]
• Superpixel Tracking[Project]
• Learning Hierarchical Image Representation with Sparsity, Saliency and Locality[Paper][Code]
• Online Multiple Support Instance Tracking [Paper][Code]
• Visual Tracking with Online Multiple Instance Learning[Project]
• Object detection and recognition[Project]
• Compressive Sensing Resources[Project]
• Robust Real-Time Visual Tracking using Pixel-Wise Posteriors[Project]
• Tracking-Learning-Detection[Project][OpenTLD/C++ Code]
• HandVu: vision-based hand gesture interface[Project]
• Learning Probabilistic Non-Linear Latent Variable Models for Tracking Complex Activities[Project]

• 3D Reconstruction of a Moving Object[Paper] [Code]
• Shape From Shading Using Linear Approximation[Code]
• Combining Shape from Shading and Stereo Depth Maps[Project][Code]
• Shape from Shading: A Survey[Paper][Code]
• A Spatio-Temporal Descriptor based on 3D Gradients (HOG3D)[Project][Code]
• Multi-camera Scene Reconstruction via Graph Cuts[Paper][Code]
• A Fast Marching Formulation of Perspective Shape from Shading under Frontal Illumination[Paper][Code]
• Reconstruction:3D Shape, Illumination, Shading, Reflectance, Texture[Project]
• Monocular Tracking of 3D Human Motion with a Coordinated Mixture of Factor Analyzers[Code]
• Learning 3-D Scene Structure from a Single Still Image[Project]

• Matlab class for computing Approximate Nearest Neighbor (ANN) [Matlab class providing interface to ANN library]
• Random Sampling[code]
• Probabilistic Latent Semantic Analysis (pLSA)[Code]
• FASTANN and FASTCLUSTER for approximate k-means (AKM)[Project]
• Fast Intersection / Additive Kernel SVMs[Project]
• SVM[Code]
• Ensemble learning[Project]
• Deep Learning[Net]
• Deep Learning Methods for Vision[Project]
• Neural Network for Recognition of Handwritten Digits[Project]
• Training a deep autoencoder or a classifier on MNIST digits[Project]
• THE MNIST DATABASE of handwritten digits[Project]
• Ersatz: deep neural networks in the cloud[Project]
• Deep Learning [Project]
• sparseLM : Sparse Levenberg-Marquardt nonlinear least squares in C/C++[Project]
• Weka 3: Data Mining Software in Java[Project]
• Invited talk “A Tutorial on Deep Learning” by Dr. Kai Yu (余凯)[Video]
• CNN – Convolutional neural network class[Matlab Tool]
• Yann LeCun's Publications[Website]
• LeNet-5, convolutional neural networks[Project]
• Deep learning luminary Geoffrey E. Hinton's HomePage[Website]
• Multiple Instance Logistic Discriminant-based Metric Learning (MildML) and Logistic Discriminant-based Metric Learning (LDML)[Code]
• Sparse coding simulation software[Project]
• Visual Recognition and Machine Learning Summer School[Software]

• Action Recognition by Dense Trajectories[Project][Code]
• Action Recognition Using a Distributed Representation of Pose and Appearance[Project]
• Recognition Using Regions[Paper][Code]
• 2D Articulated Human Pose Estimation[Project]
• Fast Human Pose Estimation Using Appearance and Motion via Multi-Dimensional Boosting Regression[Paper][Code]
• Estimating Human Pose from Occluded Images[Paper][Code]
• Quasi-dense wide baseline matching[Project]
• ChaLearn Gesture Challenge: Principal motion: PCA-based reconstruction of motion histograms[Project]
• Real Time Head Pose Estimation with Random Regression Forests[Project]
• 2D Action Recognition Serves 3D Human Pose Estimation[Project]
• A Hough Transform-Based Voting Framework for Action Recognition[Project]
• Motion Interchange Patterns for Action Recognition in Unconstrained Videos[Project]
• 2D articulated human pose estimation software[Project]
• Learning and detecting shape models [code]
• Progressive Search Space Reduction for Human Pose Estimation[Project]
• Learning Non-Rigid 3D Shape from 2D Motion[Project]

• Distance Transforms of Sampled Functions[Project]
• The Computer Vision Homepage[Project]
• Efficient appearance distances between windows[code]
• Image Exploration algorithm[code]
• Motion Magnification [Project]
• Bilateral Filtering for Gray and Color Images [Project]
• A Fast Approximation of the Bilateral Filter using a Signal Processing Approach [Project]

• EGT: a Toolbox for Multiple View Geometry and Visual Servoing[Project] [Code]
• a development kit of matlab mex functions for OpenCV library[Project]
• Fast Artificial Neural Network Library[Project]

• finger-detection-and-gesture-recognition [Code]
• Hand and Finger Detection using JavaCV[Project]
• Hand and fingers detection[Code]

• Nonparametric Scene Parsing via Label Transfer [Project]

• High accuracy optical flow using a theory for warping [Project]
• Dense Trajectories Video Description [Project]
• SIFT Flow: Dense Correspondence across Scenes and its Applications[Project]
• KLT: An Implementation of the Kanade-Lucas-Tomasi Feature Tracker [Project]
• Tracking Cars Using Optical Flow[Project]
• Secrets of optical flow estimation and their principles[Project]
• implementation of the Black and Anandan dense optical flow method[Project]
• Optical Flow Computation[Project]
• Beyond Pixels: Exploring New Representations and Applications for Motion Analysis[Project]
• A Database and Evaluation Methodology for Optical Flow[Project]
• optical flow relative[Project]
• Robust Optical Flow Estimation [Project]
• optical flow[Project]

• Semi-Supervised Distance Metric Learning for Collaborative Image Retrieval [Paper][code]

• Markov Random Fields for Super-Resolution [Project]
• A Comparative Study of Energy Minimization Methods for Markov Random Fields with Smoothness-Based Priors [Project]

• Moving Object Extraction, Using Models or Analysis of Regions [Project]
• Background Subtraction: Experiments and Improvements for ViBe [Project]
• A Self-Organizing Approach to Background Subtraction for Visual Surveillance Applications [Project]
• changedetection.net: A new change detection benchmark dataset[Project]
• ViBe – a powerful technique for background detection and subtraction in video sequences[Project]
• Background Subtraction Program[Project]
• Motion Detection Algorithms[Project]
• Stuttgart Artificial Background Subtraction Dataset[Project]
• Object Detection, Motion Estimation, and Tracking[Project]

Feature Detection and Description

General Libraries:

• VLFeat – Implementation of various feature descriptors (including SIFT, HOG, and LBP) and covariant feature detectors (including DoG, Hessian, Harris Laplace, Hessian Laplace, Multiscale Hessian, Multiscale Harris). Easy-to-use Matlab interface. See Modern features: Software – Slides providing a demonstration of VLFeat and also links to other software. Check also VLFeat hands-on session training
• OpenCV – Various implementations of modern feature detectors and descriptors (SIFT, SURF, FAST, BRIEF, ORB, FREAK, etc.)

Fast Keypoint Detectors for Real-time Applications:

• FAST – High-speed corner detector implementation for a wide variety of platforms
• AGAST – Even faster than the FAST corner detector. A multi-scale version of this method is used for the BRISK descriptor (ECCV 2010).

Binary Descriptors for Real-Time Applications:

• BRIEF – C++ code for a fast and accurate interest point descriptor (not invariant to rotations and scale) (ECCV 2010)
• ORB – OpenCV implementation of the Oriented-Brief (ORB) descriptor (invariant to rotations, but not scale)
• BRISK – Efficient Binary descriptor invariant to rotations and scale. It includes a Matlab mex interface. (ICCV 2011)
• FREAK – Faster than BRISK (invariant to rotations and scale) (CVPR 2012)

SIFT and SURF Implementations:

Other Local Feature Detectors and Descriptors:

• VGG Affine Covariant features – Oxford code for various affine covariant feature detectors and descriptors.
• LIOP descriptor – Source code for the Local Intensity order Pattern (LIOP) descriptor (ICCV 2011).
• Local Symmetry Features – Source code for matching of local symmetry features under large variations in lighting, age, and rendering style (CVPR 2012).

Global Image Descriptors:

• GIST – Matlab code for the GIST descriptor
• CENTRIST – Global visual descriptor for scene categorization and object detection (PAMI 2011)

Feature Coding and Pooling

• VGG Feature Encoding Toolkit – Source code for various state-of-the-art feature encoding methods – including Standard hard encoding, Kernel codebook encoding, Locality-constrained linear encoding, and Fisher kernel encoding.
• Spatial Pyramid Matching – Source code for feature pooling based on spatial pyramid matching (widely used for image classification)

Convolutional Nets and Deep Learning

• EBLearn – C++ Library for Energy-Based Learning. It includes several demos and step-by-step instructions to train classifiers based on convolutional neural networks.
• Torch7 – Provides a matlab-like environment for state-of-the-art machine learning algorithms, including a fast implementation of convolutional neural networks.
• Deep Learning – Various links for deep learning software.

Part-Based Models

Attributes and Semantic Features

Large-Scale Learning

• Additive Kernels – Source code for fast additive kernel SVM classifiers (PAMI 2013).
• LIBLINEAR – Library for large-scale linear SVM classification.
• VLFeat – Implementation for Pegasos SVM and Homogeneous Kernel map.

Fast Indexing and Image Retrieval

• FLANN – Library for performing fast approximate nearest neighbor.
• Kernelized LSH – Source code for Kernelized Locality-Sensitive Hashing (ICCV 2009).
• ITQ Binary codes – Code for generation of small binary codes using Iterative Quantization and other baselines such as Locality-Sensitive-Hashing (CVPR 2011).
• INRIA Image Retrieval – Efficient code for state-of-the-art large-scale image retrieval (CVPR 2011).

Object Detection

3D Recognition

Action Recognition

Datasets

Attributes

• Animals with Attributes – 30,475 images of 50 animals classes with 6 pre-extracted feature representations for each image.
• aYahoo and aPascal – Attribute annotations for images collected from Yahoo and Pascal VOC 2008.
• FaceTracer – 15,000 faces annotated with 10 attributes and fiducial points.
• PubFig – 58,797 face images of 200 people with 73 attribute classifier outputs.
• LFW – 13,233 face images of 5,749 people with 73 attribute classifier outputs.
• Human Attributes – 8,000 people with annotated attributes. Check also this link for another dataset of human attributes.
• SUN Attribute Database – Large-scale scene attribute database with a taxonomy of 102 attributes.
• ImageNet Attributes – Variety of attribute labels for the ImageNet dataset.
• Relative attributes – Data for OSR and a subset of PubFig datasets. Check also this link for the WhittleSearch data.
• Attribute Discovery Dataset – Images of shopping categories associated with textual descriptions.

Fine-grained Visual Categorization

Face Detection

• FDDB – UMass face detection dataset and benchmark (5,000+ faces)
• CMU/MIT – Classical face detection dataset.

Face Recognition

• Face Recognition Homepage – Large collection of face recognition datasets.
• LFW – UMass unconstrained face recognition dataset (13,000+ face images).
• NIST Face Homepage – includes face recognition grand challenge (FRGC), vendor tests (FRVT) and others.
• CMU Multi-PIE – contains more than 750,000 images of 337 people, with 15 different views and 19 lighting conditions.
• FERET – Classical face recognition dataset.
• Deng Cai’s face dataset in Matlab Format – Easy to use if you want play with simple face datasets including Yale, ORL, PIE, and Extended Yale B.
• SCFace – Low-resolution face dataset captured from surveillance cameras.

Handwritten Digits

• MNIST – large dataset containing a training set of 60,000 examples, and a test set of 10,000 examples.

Pedestrian Detection

Generic Object Recognition

• ImageNet – Currently the largest visual recognition dataset in terms of number of categories and images.
• Tiny Images – 80 million 32×32 low resolution images.
• Pascal VOC – One of the most influential visual recognition datasets.
• Caltech 101 / Caltech 256 – Popular image datasets containing 101 and 256 object categories, respectively.
• MIT LabelMe – Online annotation tool for building computer vision databases.

Scene Recognition

Feature Detection and Description

Action Recognition

RGBD Recognition

Reference:

[1]: http://rogerioferis.com/VisualRecognitionAndSearch/Resources.html

Six Principles of Execution

OP Philosophy Principle 1: Once you have an idea, dare to act on it

OP Principle 2: Where there is input, there must be output

OP Principle 3: If you have three-tenths, output three-tenths

"Perfectionists" are in fact afraid of imperfection; the idea of perfection controls everything they do. Because they demand perfection and fear falling short, they feel everything must be fully prepared before they can start, yet the more they prepare, the more inadequate it feels. That inadequacy is imagined: to a practitioner, much of the preparation is unnecessary. Only by taking action and getting into practice do you discover what you truly lack, and only then does further preparation become grounded in reality. Continually improving what you output is real pragmatism, so be an action-oriented perfectionist, not an armchair one.

"Once I begin, I will finish." "Once I decide, I will follow through."

The marathon at the 1968 Mexico City Olympics produced a moving scene. A runner with an injured left knee finished the entire course on sheer willpower; by the time he reached the finish line, the results were already posted on the board. For him, though, winning or losing was no longer what mattered. As he crossed the line, a reporter asked him, "What made you keep going all the way to the finish?" He answered, "I just kept telling myself I had to finish it!" That sense of mission carried him the whole distance, and it also won him the warmest applause in the stadium.

Humans invented the computer in 1946; we call it the first-generation computer. Built from vacuum tubes and capacitors, it was roughly four hundred times the size of a modern (fourth-generation) machine, with a memory cycle time of one millisecond. Through continued research and breakthroughs, the second generation moved to transistors and the third to integrated circuits; today's fourth generation uses very-large-scale integrated circuits, occupies less than one four-hundredth of the first generation's volume, and runs more than a hundred thousand times faster. We can be sure that computer technology will keep advancing, which also shows that human potential is still developing and breaking through.

The Difference Between SAN and NAS Storage

In a SAN, the file system (FS) still resides on each application server; with NAS, every application server uses the same file system through a network sharing protocol (such as NFS or CIFS). In other words, the difference between NAS and SAN storage systems is that NAS has its own file-system management. NAS focuses on applications, users, and files and the data they share; SAN focuses on disks, tapes, and the reliable infrastructure that connects them. Going forward, a complete solution spanning desktop systems, centralized data management, and storage devices will combine NAS and SAN.

1. User authentication

2. MCU convergence management

3. Architecture for massive request volumes
3、海量请求架构