Monday, September 26, 2016

Guide to Dynamic Programming - I will show the coding result.

Problem: given n, find the number of different ways to write n as the sum of 1, 3, 4

I conclude the method as 4 step solver.

Step 1. Define:
D[n] = number of ways to reach the sum n.

2. Recurrence:
D[n] = D[n - 1] + D[n - 3] + D[n - 4]

3. Base Case
D[0] = 1, D[1] = 1, D[2] = 1, D[3] = 2;
D[negative] = 0,

4. Result Case
D[n]

Time Complexity: O(n * m) - two for loops with n, m length; n is the sum; m is the length of that numbers array.
Space: O(n) - the whole D array

ok, now let's get hands dirty.

Result: [1, 1, 1, 2, 4, 6]
16:09:04:591 [main] INFO com.StandfordDP.Dynamic1 - 6

Reference: http://web.stanford.edu/class/cs97si/04-dynamic-programming.pdf

Sunday, September 25, 2016

BackPack Problems - formula + time complexity

Backpack VI

http://www.lintcode.com/problem/backpack-vi/

这题感觉就是一维的，我也是想了想整两维反而整不出来。
最后解了一维方程如下：
I. ret[i] += ret[i - nums[j - 1]];

II. time complexity还是二维的是O(m * n)

因为每次都可以取任一个数，各种反复取，而且最恶心一点是排列组合顺序不同都要算上。
所以不能直接加，而要用for循环加

外层是m或者说target
内层循环是n
所以time complexity是O(m * n)

Saturday, September 24, 2016

Recommender System

I. Introduction.
core trick - CF - based on similarity

Two CFs -
1. User CF
2. Item CF

Applications - amazon commodities/ netflix movies.

For now, let's talk about movie.

User CF - based on user similarity
Item CF - based on item similarity

Let's use item CF dude,
item have less unique quantities.
items' properties seldom change.
you like marval ant's man, i present you the iron man, sounds convincing.

II. Procedures
1. co-occurrence matrix
2. rating matrix
3. matrix computation to get recommendation result.

Ok, how to tell the similarity between different movies?
I suggest using machine learning e.g. logistic regression -
and that requires feature -

1. users
watching history
rating history
favorite list
2.
movies category
movies

Let's brainstorm -
e.g. if one user rated two movies, these two are related.
calc

value(M1, M2) = count(x_M1 && x_M2) = 2
Rated Matrix

How about ratings?
Rating Matrix

Normalization

Conditional probability matrix not symmetrical

没看过， rating 可以给一个平均分

Q&A
Q: 为什么没看过rating就是0？
A: 这个地方可以改进，没看过， rating 可以给一个平均分
Q: 例如说来了个新用户小白，他的列全是零
Q: 用户没看过的电影rating给0分会不会影响第一个相似性矩阵的真实性？ A: 会的。一般一个更好的做法是给没看过的电影系统平均分

Logistic regression -> Deep learning.